EDITORIAL


Revolutionary technologies 


In this issue of Science, we present reviews of four technologies
whose power and rapid growth across biological research communities
make them revolutionary (see page 864). New technology is one of the
most powerful drivers of scientific progress. For example, the
earliest microscopes magnified images only 50-fold at most. When the
Dutch fabric merchant and amateur scientist Antonie van Leeuwenhoek
developed microscopes with more than 200-fold magnifications (likely
to examine cloth), he used them to study many items, including pond
water and plaque from teeth. His observations of “animalcules” led to
fundamental discoveries in microbiology and cell biology, and spurred
the elaboration of improved microscopes. Today, various light
microscopes remain prime tools in modern biology. This example
embodies two characteristics of a revolutionary technology: a
capability for addressing questions better than extant technologies,
and the possibility of being utilized and adapted by many other
investigators.

The discovery of x-rays in 1895 ushered in a multifaceted revolution
in imaging. As scientists sought to understand the nature of these
electromagnetic waves, they realized that they were diffracted by
crystals, establishing that the wavelengths of x-rays were comparable
to the separation between atoms in crystals. In 1913, William Henry
Bragg and his son William Lawrence Bragg found that diffraction
patterns could be interpreted to reveal the arrangement of atoms in a
crystal. The Braggs determined the structures of many simple
substances, including table salt and diamond. Others began using
similar techniques to reveal more complex structures of inorganic and
organic compounds. In the late 1950s, these methods were extended to
determine the structure of proteins, and eventually to larger proteins
and protein complexes. Thousands of structures are now reported each
year and are foundational to our understanding of biochemistry and
cell biology. Technical innovations, improved commercial and
shared-facility instrumentation, and powerful software continue to
drive the x-ray crystallography revolution.

As the field of recombinant DNA technology was evolving, a
revolutionary technique in the form of the polymerase chain reaction
(PCR) was developed by Kary Mullis (from 1983 to 1985) at the
biotechnology company Cetus. PCR allows tremendous amplification of
specific DNA sequences. It had an almost immediate revolutionary
impact on many fields, including gene cloning and DNA analysis, and
forms the foundation of many methods in modern molecular biology. PCR
depends on several underlying technologies, including the chemical
synthesis of short sequences of DNA and the availability of
appropriate enzymes, but also machines for programmable temperature
cycling. The method was invented shortly before I was setting up my
first independent laboratory. Cetus had partnered with one company to
sell PCR machines, although other devices with similar capabilities
were available. I remember calling one of the other companies and
asking if its machine would work for PCR. Concerned about patent
issues, the sales representative said, “I can’t say, but no one has
said that it didn’t work for their particular application!” My lab
joined the PCR revolution.

The reviews in this issue of Science focus on two imaging methods that
are extending and complementing the powers of traditional light
microscopy and x-ray crystallography, and two methods for manipulating
DNA to drive a range of discoveries and potentially powerful
applications. Such technologies can help to resolve long-standing
questions and can open up new vistas, revealing new phenomena and
allowing the formulation of questions previously unimagined.

Jeremy Berg, Editor-in-Chief
10.1126/science.aav1775
“I do not want to lie to myself anymore.”


Nicolas Hulot, who resigned his position as France's environment 
minister live on France Inter radio to protest the “ministeps” taken by his 
government and others to slow global warming. 


IN BRIEF
Edited by Catherine Matacic 


POPULAR CULTURE 


Wisecracker doesn’t crack NIH chief 


Comedian Sacha Baron Cohen (left) quizzes National Institutes of Health Director Francis Collins 
(right) on HIV/AIDS and trans fats. 


British comedian Sacha Baron Cohen has duped many public
figures into making fools of themselves with his spoof shows
and interviews. But Francis Collins, director of the U.S. National
Institutes of Health in Bethesda, Maryland, quickly caught
on when he was interviewed for an episode of Baron Cohen’s
Who is America?, which aired 19 August on Showtime. The
trickster—dressed as wheelchair-driving southerner Billy Wayne 
Ruddick, Jr. Ph.D.—asks Collins why “big agriculture” is spiking food 
with trans fats to make people transgender. Collins gamely replies that 
trans fats contribute to heart attacks and stroke, but have nothing to 
do with gender. When Baron Cohen describes an experiment to prove 
whether AIDS exists—one that involved taking a sample of his own 
blood using a needle from an HIV-positive homeless person—Collins 
gives him a stern talking-to: “You might want to wait 6 weeks and then 
have a test and see if you turned positive.” Collins, who was told he was 
doing a Showtime interview with a nonexpert, says he knew “a few 
minutes in” that it was a ruse. “I was pretty irritated from having been 
misled.” But ever the dutiful public servant, he decided to keep going— 
and get across whatever public health messages would stick. 
STD cases set record


PUBLIC HEALTH | Cases of the sexually 
transmitted diseases (STDs) chlamydia, 
gonorrhea, and syphilis hit a record-setting 
2.3 million in the United States in 2017,
a 10% increase over 2016, the Centers for
Disease Control and Prevention (CDC) in 
Atlanta reported on 28 August. The 2017 
data, which are preliminary, mark the fourth 
consecutive year of sharp increases in these 
STDs. Since 2013, gonorrhea diagnoses have 
increased 67% and nearly doubled among 
men; diagnoses of the early stages of syphilis 
have grown by 76%; and cases of chlamydia 
grew from 1.4 million to 1.7 million—nearly 
half of them in women 15 to 24 years
old. “The systems that identify, treat, and
ultimately prevent STDs are strained to the 
near-breaking point,” says Jonathan Mermin, 
director of the CDC center responsible for 
STD prevention. “We are sliding backward.” 


Former CDC director arrested 


#METOO | Tom Frieden, the physician who 
directed the U.S. Centers for Disease Control 
and Prevention in Atlanta from 2009 to 2017, 
was arrested on 24 August in New York City 
and charged with forcible touching of inti- 
mate parts, sexual abuse in the third degree, 
and harassment in the second degree. He is 
accused of grabbing the buttocks of a female 
friend without permission at his home
in October 2017, making her “alarmed and
annoyed.” Frieden, 57, pleaded not guilty,
surrendered his passport, and was released
without bail by Judge Michael Yavinsky
of the Kings County Criminal Court.

José Castro, CEO of the New York City 
nonprofit Vital Strategies, where Frieden last 
year launched the public health initiative 
Resolve to Save Lives (Science, 22 September 
2017, p. 1217), wrote in a statement that 
the accuser is a 30-year friend of the Frieden 
family, and that the former director “has
the highest ethical standards both personally
and professionally.” Frieden’s next court
appearance is scheduled for 11 October. 


Global wind mapper launched 


METEOROLOGY | After nearly 2 decades
of preparation, the European Space
Agency (ESA) launched its €480 million
wind-measuring satellite, Aeolus,
into space last week from French Guiana.
Named for the “keeper of the winds”
in Greek mythology, Aeolus is the first
satellite to measure winds directly. It will
blast Earth’s atmosphere with a laser
and catch photons reflected off air molecules,
allowing the winds’ height, speed,
and direction to be detected. The mission
should improve climate models and
weather forecasts, especially in the tropics,
where few weather balloons are deployed.
ESA selected Aeolus for flight in 1999,
but engineering its high-power laser was
challenging and caused delays.


A go for magic mushroom study


BIOMEDICINE | A phase II trial to treat
depression with psilocybin, the active
ingredient in magic mushrooms, just got
the go-ahead from the U.S. Food and Drug
Administration. Preliminary studies
have suggested the psychedelic substance,
a schedule 1 drug in the United States,
can help alleviate depressive symptoms.
The new trial, set to start next week in the
United Kingdom, will give 216 patients
who have not responded to other therapies
a single dose of the compound: either 1,
10, or 25 milligrams. A spokesperson for
the company running the trial, Compass
Pathways, says results are expected toward
the end of 2019. If positive, the spokesperson
adds, Compass will likely conduct a
phase III trial, the last step before licensing.
Psilocybin is also being tested in clinical
trials to treat migraines, nicotine addiction,
and obsessive-compulsive disorder.


Carbon plan sinks Australian PM


CLIMATE SCIENCE | Disagreement over
Australia’s efforts to cut carbon emissions
led the ruling coalition to dump
Prime Minister Malcolm Turnbull last
week. Turnbull, leader of the Liberal and
National parties’ coalition since 2015,
recently proposed an energy bill calling for
Australia to cut greenhouse gas emissions
by 26% from 2005 levels by 2030, in line
with the country’s Paris agreement target.
Conservative coalition members, many
of whom want Australia to withdraw from
the climate treaty, pushed Turnbull into
dropping the target; but they still forced a
leadership vote that ended in a victory for
Scott Morrison, a friend of Australia’s coal
industry. This is the fourth time in 10 years
that squabbles over energy and emissions
policies have contributed to the toppling of
a prime minister.


FORESTRY


Climate change is making trees bigger


As global temperatures rise, trees around the world are experiencing
longer growing seasons, sometimes as much as three
extra weeks a year. All that time helps trees grow faster.
But a study of the forests of Central Europe suggests higher
temperatures, combined with pollution, are making wood
weaker. Scientists analyzed core samples from four species
of century-old trees, including the European beech (above), and
discovered that their wood density has dropped between 8%
and 12% since 1900, they report in Forest Ecology and Management.
Rising temperatures, and the faster growth they spur, probably
account for some of the drop. But another factor, scientists say,
is more nitrogen in the soil from agricultural fertilizers and vehicle
exhaust. The result? Less durable lumber and trees that may
be less efficient at soaking up the greenhouse gas carbon dioxide.

Pollution could cause trees to break more easily


Nominee ducks climate query 


SCIENCE POLICY | Kelvin Droegemeier
sidestepped the only hardball question last
week from the Senate panel reviewing
his nomination for director of the White
House Office of Science and Technology
Policy. Senator Ted Cruz (R-TX), who
says the planet is not warming and that
Democrats have demagogued the issue,
asked about “empirical” satellite data that
“show no statistically significant warming
over the past 18 years.” Droegemeier, an
expert in severe storm prediction at the
University of Oklahoma in Norman, gave a
nonanswer: “I’m familiar with some of
those studies ... but I don’t study climate.”
His predecessor, John Holdren, repeatedly
dismissed the supposed evidence for a
“hiatus,” but some climate scientists are
worried that Droegemeier has decided to
remain mum on the issue as the price of
becoming the president’s science adviser.
He revealed some of his views in a 2014
talk to researchers, when he described the
planet’s resiliency by saying: “You can
kick it in the butt really, really hard and
it will come back.”
American chestnuts were once
a dominant tree, and a major 
source of food, in the forests of 
eastern North America. 


Can a transgenic chestnut restore a forest icon? 


Researchers seek permit to release American chestnut engineered to resist a deadly blight 


By Gabriel Popkin, in Syracuse, New York 


Two deer-fenced plots here contain
some of the world’s most highly regulated
trees. Each summer researchers
double-bag every flower the trees
produce. One bag, made of breathable
plastic, keeps them from spreading
pollen. The second, an aluminum mesh
screen added a few weeks later, prevents 
squirrels from stealing the spiky green fruits 
that emerge from pollinated flowers. The re- 
searchers report their every move to regula- 
tors with the U.S. Department of Agriculture 
(USDA). “We tell them when we plant and 
where we plant and how many we plant,” 
says Andrew Newhouse, a biologist at the 
nearby State University of New York College 
of Environmental Science and Forestry 
(SUNY ESF). 

These American chestnut trees (Castanea 
dentata) are under such tight security be- 
cause they are genetically modified (GM) or- 
ganisms, engineered to resist a deadly blight 
that has all but erased the once widespread 
species from North American forests. Now, 
Newhouse and his colleagues hope to use the 
GM chestnuts to restore the tree to its for- 
mer home. In the coming weeks, they plan 
to formally ask U.S. regulators for approval 
to breed their trees with nonengineered rela- 
tives and plant them in forests. 

If the regulators approve the request, it 
would be “precedent setting”: the first use
of a GM tree to try to restore a native species
in North America, says Doria Gordon,
lead senior scientist at the Environmental 
Defense Fund (EDF) in Washington, D.C. 
But deciding whether to unleash a GM tree 
into the wild could take years. 

American chestnuts, towering 30 meters 
or more, once dominated forests through- 
out the Appalachian Mountains. But in the 
early 1900s, a fungal infection appeared on 
trees at the Bronx Zoo in New York City, and 
then spread rapidly. The so-called chestnut 
blight—an accidental import from Asia— 
releases a toxin that girdles trees and kills 
everything above the infection site, though
still-living roots sometimes send up new
shoots. By midcentury, large American
chestnuts had all but disappeared.

Researchers seal off the flowers of a chestnut
carrying a wheat gene that neutralizes a fungal toxin.

In 1990, SUNY ESF tree geneticists 
William Powell and Charles Maynard (now 
retired) decided to try to create resistant 
chestnuts with the then-new technology 
of genetic engineering. Eventually, they in- 
serted into the tree’s genome a wheat gene 
that codes for an enzyme called oxalate 
oxidase, or OxO. It breaks down the oxalic 
acid the pathogen releases, which is what 
kills the trees. “We’re basically taking the 
weapon away from the fungus,” Powell says. 

It didn’t work at first. Then, the scien- 
tists changed the wheat gene’s promoter se- 
quence to cause OxO to be expressed at high 
levels. In 2014, they reported that a GM tree 
named Darling 58 both resisted blight in- 
fection and transmitted resistance to its 
offspring. Subsequent tests showed that it 
produces nuts indistinguishable from those 
of native trees, Newhouse says. And its pol- 
len, flowers, and decaying leaves don’t harm 
bees, beneficial soil fungi, or tadpoles that 
hatch in pools on the forest floor. 

But the request to release it is likely to 
face a lengthy regulatory road. The United 
States, China, and Brazil have approved some 
transgenic trees for use in fruit orchards, 
biofuel plantations, and afforestation proj- 
ects. But like GM crops and animals, GM 
trees are controversial, and ethical and 
ecological concerns are heightened because 
the chestnut trees would grow wild. Regulators
from three federal agencies are likely to
take a close look at those concerns. USDA 
officials, for instance, will seek to determine 
whether the tree could become a weed or 
otherwise threaten existing plants. The Food 
and Drug Administration will study whether 
the tree’s fruit is safe to eat, and the Envi- 
ronmental Protection Agency will consider 
whether the trees’ blight-blocking enzyme 
should be regulated as a fungicide. 

Regulators also “need a really clear pro- 
cess for transparently incorporating ... 
cultural and spiritual values into the deci- 
sion-making,” says Gordon, who serves on a 
committee convened by the National Acade- 
mies of Sciences, Engineering, and Medicine 
to examine issues surrounding GM trees. The 
American chestnut was a culturally impor- 
tant tree and important food source for many 
Native Americans, and some are wary of ge- 
netically altering a species with which they 
have a long relationship, says Neil Patterson, 
a member of the Tuscarora Nation and assis- 
tant director of the Center for Native Peoples 
and the Environment at SUNY ESF. 

If the tree survives the regulatory gaunt- 
let, researchers not directly involved in its 
development are cautiously optimistic that 
it could help with restoration. It seems to 
fend off blight better than hybrids pro- 
duced so far through traditional breeding 
methods, says Jared Westbrook, chief scien- 
tist of the Asheville, North Carolina-based 
American Chestnut Foundation, which has 
spent 35 years attempting to breed a blight- 
resistant chestnut and helped fund the GM 
tree research. But to maximize survival, the 
GM trees—which are all descended from 
clones of one “founder” tree—will need to 
be crossed with trees adapted to local cli- 
mate and diseases, Westbrook says. “We’re 
not going to restore a species with a clone.” 

All that work could be undone if the fun- 
gus evolves a way around the defense, says 
Richard Sniezko, a tree geneticist with the 
U.S. Forest Service in Cottage Grove, Ore- 
gon. Powell and Newhouse doubt that kind 
of natural selection will occur, because 
their tree does not actually kill the fungus. 
Still, “None of us wants something to be 
put out there ... and it fails after 10 years,” 
Sniezko says. 

The continuing influx of insects and 
pathogens from abroad also could present 
a new chestnut killer, says Gary Lovett, an 
ecologist at the Cary Institute of Ecosystem 
Studies in Millbrook, New York. Creating 
resistant varieties “is a good thing,” he says. 
“But it doesn’t do any good if we keep intro- 
ducing new pests.” 


Gabriel Popkin is a journalist in Mount 
Rainier, Maryland. 
New pain drugs may lower
overdose and addiction risk 


By slowing action or targeting different receptors, altered 
opioids or alternatives aim to sidestep abuse 


By Robert F. Service, in Boston 


As the opioid crisis continues to ravage
U.S. communities, scientists and drug
companies have intensified their efforts
to develop safer and less addictive
pain medications. Now, multiple
research groups are claiming progress
in devising novel opioids—or alternatives— 
that seem to offer pain relief with far less 
risk of addiction or of the opioid-induced re- 
spiratory depression that all too commonly 
leads to death. 

Most of these studies, reported at a meeting
here and in a paper released this week,
have only been done in animals, so the
experimental compounds face significant
hurdles before they can become approved
medications. Yet they are raising tentative
hopes among researchers. “It’s encouraging,”
says Laura Bohn, a biochemist at Scripps
Research in Jupiter, Florida. “There has been
a really big push to develop nonopioid pain
relievers. But it has been really hard.”

A record 72,000 people in the United States
died last year from overdoses, up nearly
10% from 2016, according to an estimate
this month by the Centers for Disease
Control and Prevention. That rise was driven
primarily by an increase in overdoses from
highly potent synthetic opioids such as
fentanyl and carfentanil. Another 2.1 million
Americans are believed to regularly abuse
opioids, including natural ones like morphine,
semisynthetic compounds such as oxycodone,
and the synthetics, and have signs of
addiction, such as withdrawal symptoms, if
they try to quit.

Opioids are powerful pain relievers because
they bind to a key cell membrane protein,
known as the μ-opioid receptor (MOR), on
neurons in the brain and spinal cord. Once
activated, the MOR triggers an intracellular
“G protein” to initiate a molecular cascade
that leads to pain relief. But traditional
opioids also activate another intracellular
protein, β-arrestin2, which produces
respiratory depression and constipation, the
most common opioid side effects for such
drugs. Several “biased opioids,” including one
now under review by the Food and Drug
Administration, offer pain relief while
reducing β-arrestin2 activation, but it’s
not clear whether they are less addictive
than conventional opioids (Science,
17 November 2017, p. 847).

At last week’s meeting of the American
Chemical Society here, Neel Anand,
a senior director for medicinal chemistry
at Nektar Therapeutics, a biotech
firm in South San Francisco, California,
described an approach that might help.
Nektar’s drug, called NKTR-181, is a version
of oxycodone to which researchers
have linked a molecular tail
called polyethylene glycol,
a common pharmaceutical
strategy for extending the life span of
medicines in the blood. Anand reported that
in animal studies, NKTR-181 crosses the
blood-brain barrier 70 times more slowly
than oxycodone. Instead of a sharp spike in
both pain relief and euphoria, caused
by an upsurge of the neuro- 
transmitter dopamine in brain regions tied 
to addiction, NKTR-181 triggers a slower 
release of dopamine that produces flat- 
ter, more sustained pain relief and less 
euphoria. In clinical studies of more than 
600 patients taking the compound, Nektar 
researchers found far fewer signs of addic- 
tion than in patients given oxycodone, as 
well as fewer side effects. 

“It clearly works” as a painkiller, says 
Steven McKerrall, a medicinal chemist with 
Genentech in South San Francisco. “They’ve 
built [a timed release] into the drug itself.”
But McKerrall and others caution that opi- 
oid addicts have devised strategies to defeat 
other abuse-resistant formulations, for ex- 
ample, by crushing pills that have timed- 
release coatings. “Addicts will always find a 
way,” Bohn says.

They might have a tougher time with
a compound developed by Astraea Thera- 
peutics, a biotech company in Mountain 
View, California, that hits two brain mol- 
ecules at once. AT-121 stimulates not only 
MOR, but also a close cousin known as the 
nociceptin opioid receptor (NOR). When 
activated in the brain, NOR appears to 
counteract MOR. At the same time, it re- 
inforces MOR’s pain relieving activity else- 
where in the central nervous system, says 
Nurulain Zaveri, Astraea’s founder and 
chief scientific officer. The drug isn’t the 
first to target both receptors—another one 
is already in phase III trials for diabetic 
nerve pain, among other uses, but that 
compound targets other receptors as well, 
and animal studies suggest it may have ad- 
dictive properties. 

In this week’s issue of Science Transla- 
tional Medicine, Zaveri and academic col- 
leagues in the United States and Japan 
report that rhesus monkeys given AT-121 
experienced 100-fold greater pain relief 
than the same dose of morphine provided. 
Yet the drug did not trigger respiratory de- 
pression, addictivelike behaviors, or even 
tolerance, where more of a compound 
is needed over time to produce the same 
desirable effects such as pain relief. AT- 
121 even appears to counteract addiction 
to standard opioids, such as oxycodone, 
Zaveri says. Monkeys hooked on oxyco- 
done and trained to self-administer the 
drug sharply reduced further drug seeking 
when given AT-121. “It looks very promis- 
ing,” Bohn says of the new compound. 

Avoiding opioid receptors altogether is 
another appealing strategy for relieving 
pain with a reduced risk of addiction, says 
Roger Kroes, senior director for discovery 
science at Aptinyx, a biotech firm in Evan- 
ston, Illinois, who described one of his com- 


pany’s compounds at the meeting. Called 
NYX-2925, it activates the NMDA receptor, 
which helps strengthen neural synapses in- 
volved in learning and memory. Although 
acute pain doesn’t involve a learned compo- 
nent, chronic pain is thought to bring about 
long-term neural changes orchestrated, in 
part, by NMDA receptors. 

Many well-known drugs that block these 
receptors—among them ketamine and 
methadone—can relieve pain and can be 
less addictive than opioids. But these com- 
pounds hit other targets as well and have 
widespread side effects. NYX-2925, how- 
ever, is more selective, data show. At the 
meeting, Kroes reported that in preclinical 
studies on mice and rats, the compound 
reduced pain and led to a remodeling of 
synapses involved in learning and memory, 
essentially rewiring neural circuitry away 
from being habituated to pain. 

The results “were pretty exciting,” says 
Ben Milgram, a medicinal chemist with Am- 
gen in Cambridge, Massachusetts, who at- 
tended the meeting. Aptinyx is now testing 
NYX-2925 in two phase II clinical studies in 
people with diabetic nerve pain and fibro- 
myalgia, a disease marked by widespread 
muscle and skeletal pain. 

Drugs designed to deliver the benefits 
of opioids without the deadly risks can 
easily falter. At the meeting, researchers 
from Genentech, Merck & Co., and Amgen 
described compounds designed to tamp 
down yet another nonopioid receptor tar- 
get, a protein called Nav1.7. Although all
found their target and reduced pain in ani- 
mals, they proved weaker on other scores; 
for example, some were poorly absorbed in 
the blood or blocked other Nav proteins,
causing side effects. Still, with the opioid 
crisis taking an ever-larger toll, even pre- 
liminary good news is welcome. 


Synthetic opioids, such as this fentanyl captured in a drug raid, have caused an alarming rise in overdose deaths. 
Hybridization may give some parasites a leg up

Genomic study helps explain 
how schistosomiasis gained 
a foothold in Europe 


By Elizabeth Pennisi, in Montpellier, France 


Infecting an estimated 230 million
people, schistosomiasis is the world’s 
most widespread parasitic disease after 
malaria. But temperate latitudes were 
thought to be spared: Schistosome flat- 
worms are common only in warm places 
in Africa, India, and South America. So 
parasitologist Jerome Boissier was surprised 
when, in a single week in 2014, physicians 
in France and Germany called him to report 
that two families who had never left Europe 
had developed the disease, which can cause 
fever, chills, muscle aches, and bloody urine. 
Epidemiologists later traced the cases 
to the Cavu River on Corsica, a French is- 
land in the Mediterranean Sea, where the 
patients had swum during a vacation. Sci- 
entists found that a local freshwater snail 
was serving as the intermediate host that’s 
essential to the flatworms’ complicated life 
cycle. The river is still infested: At least 
120 people have become infected. And the 
disease is turning up elsewhere on Corsica. 
In earlier work, Boissier, who’s at the Uni- 
versity of Perpignan Via Domitia in France, 
had shown that the culprit is no ordinary 
schistosome parasite, but rather a hybrid of 
two species. Now, his team has uncovered 
the hybrid’s advantage: It appears to be bet- 
ter than the parent species at infecting both 
the snails and its unfortunate mammalian 
hosts. Such hybrids, discovered in other 
parasitic species as well, could widen a para- 
site’s range of host mammals, complicating 
efforts to control it. Presented here last week 
at the Second Joint Congress on Evolution- 
ary Biology by Boissier’s grad student Julien 
Kincaid-Smith, the work “is changing the way 
we think about disease transmission,” says 
Christina Faust, a disease ecologist at the Uni- 
versity of Glasgow in the United Kingdom. 
Humans and other mammals infected 
with schistosomiasis shed eggs in their feces 
or urine, which hatch if they reach freshwater
in time. The hatchlings then take up
residence in snails, where they mature and 
reproduce asexually, yielding tiny larvae 
that exit the snail. If those larvae encoun- 
ter another swimming or wading mam- 
mal, they burrow into its skin and settle in 
blood vessels, completing the life cycle. Five 
species infect humans; the most common 
one, Schistosoma haematobium, causes uro- 
genital schistosomiasis. It often resides in 
veins in the bladder wall or the reproduc- 
tive tract and can damage organs or impair 
fertility. Although the antiparasitic drug pra- 
ziquantel is effective, patients in developed 
countries can go undiagnosed for years. 

S. haematobium probably reached Europe 
after a patient infected elsewhere traveled 
to Corsica and urinated in the Cavu River, 
Boissier says. An intermediate host was wait- 
ing: The river is home to the snail Bulinus 
truncatus—one of a few Bulinus species that 
can support schistosomes—which also occurs 
in some African and Middle Eastern coun- 
tries. The outbreak “is a wake-up call that this 
disease can establish itself wherever the right 
[conditions] exist,” says immunologist Daniel
Colley of the University of Georgia in Athens, 
who notes that global travel makes such in- 
troductions more likely. 

Two years ago, Boissier’s team reported 
that DNA tests on the parasite eggs suggested 
the new arrival was a hybrid of S. haemato- 
bium and S. bovis, a schistosome species that 
infects livestock; on the basis of the hybrid’s 
DNA, Senegal was the most likely source. 
The hybrids themselves were not news; 
Tine Huyse, a parasitologist at the Catholic 
University of Leuven in Belgium, and a col- 
league had found them in Senegal in 2008. 
But Kincaid-Smith traveled to Senegal and 
Cameroon to collect the parent strains, and 
the team bred them in the lab to re-create the
hybrid. The researchers then tested the ability
of the parents and hybrid to infect snails
and, as a stand-in for humans, hamsters.

At least 120 people contracted schistosomiasis in the Cavu River; another river nearby has beco

The human parasites found in Africa didn’t 
infect the Corsican snails, Boissier reported. 
S. bovis, the animal variety, did infect the 
snails, but the hybrid did so even more read- 
ily, and it thrived not only in Corsican snails 
but also in B. truncatus snails from Spain 
and a related snail species from Portugal. The 
hybrid also developed faster in hamsters and 
made them sicker. 

Hybrids have emerged in other parasites, 
including the agents that cause malaria, 
leishmaniasis, and Chagas disease. Just how 
important they are in epidemiology is still un- 
clear, but their existence is worrisome, Huyse 
says, and they seem set to become more com- 
mon as travel and migration expand. Hybrids 
are more likely to infect multiple hosts, al- 
lowing some of them to “hide” in nonhuman 
animals, out of reach of the drugs given to
people. And combining two genomes gives a
parasite more genetic variation with which
to adapt to new places and hosts, Faust says.

A European foothold
Schistosomiasis was discovered on the French
island of Corsica in 2014; a DNA analysis suggests
it originated in Senegal.

When Kincaid-Smith and his colleagues 
teamed up with the Wellcome Sanger In- 
stitute in Hinxton, U.K., to fully sequence 
the hybrid, they found that three-quarters 
of its DNA came from the human parasite 
and the rest from S. bovis. That mixture 
may boost the ability of the hybrid to infect 
the Corsican snail, but with a quarter of the 
genes from S. haematobium missing, “it’s 
a wonder how the parasite can still infect 
humans,” said Kincaid-Smith, who with his 
colleagues posted a preprint about the ge- 
nome study on bioRxiv on 11 August. The 
fact that DNA from the two parent species 
was quite mixed up—sections of S. bovis 
chromosomes appeared at various places 
along the S. haematobium chromosomes— 
indicates that hybrids have been around 
long enough to mate with parents and with 
each other over multiple generations. 

“The level of genomic information [in the 
study] is impressive,” Colley says. But he’s
cautious about extrapolating the findings 
about the infectious superpowers of the 
labmade hybrid to what happens in nature. 
“We do not know how it will play out in the 
long run in terms of worsening the spread 
of or impeding the control of schistosomia-
sis,” he says.

Schistosomiasis seems set to stay on Cor- 
sica. Although no human cases occurred 
in 2017—after a total of seven cases in the 
two preceding years—the worms still occur 
in snails in the Cavu River; they also have 
surfaced in the nearby Solenzara River, 
Boissier says. Whether they overwinter in 
snails or take refuge in rodents or some 
other mammalian host isn’t clear, Kincaid- 
Smith told the meeting: “That’s also some-
thing that needs to be investigated.”

RESEARCH FUNDING


Amid fears of idea theft, NIH 
targets foreign funding links 


Agency reminds applicants to report all grant sources, 
warns against violating confidentiality of peer review 


By Jocelyn Kaiser and David Malakoff 


Fears that foreign governments are
tapping U.S.-funded research for valu-
able information have reached the
nation’s largest research funder, the
National Institutes of Health (NIH)
in Bethesda, Maryland. Last week
it sent a letter to more than 10,000 re- 
search institutions, urging them to ensure 
that NIH grantees are properly reporting 
their foreign ties. The agency also said it 
is investigating about a half-dozen cases in 
which NIH-funded investigators may have 
broken reporting rules, and it reminded 
researchers who review grant applications 
that they should not share proposal infor- 
mation with outsiders. 

At a Senate committee hearing on NIH 
oversight last week, NIH Director Francis 
Collins said “the robustness of the biomedi- 
cal research enterprise is under constant 
threat” and “the magnitude of these risks 
is increasing,” although he did not mention 
specific incidents. He added that in addi- 
tion to sending the 20 August letter ask- 
ing institutions to help curb “unacceptable 
breaches of trust and confidentiality,” NIH
has established a new advisory group to 
help the agency tighten procedures. 

NIH is feeling pressure from Congress. At 
the hearing, Senate health committee chair 
Senator Lamar Alexander (R-TN) lauded 
the contributions of foreign-born scientists
to the United States but worried about “bad
actors.” His comments reflect a resurgence 
of concern about foreign competitors to 
the United States—especially China, Russia, 
and Iran—attempting to harvest the fruits 
of federal investments in academic science. 
In March, federal prosecutors indicted 
nine Iranians on charges of hacking into 
the accounts of nearly 4000 professors at 
144 U.S. universities and stealing data that 
cost $3.4 billion to develop. In another case, 
a professor at Duke University in Durham, 
North Carolina, has alleged that a Chinese 
doctoral student working in his laboratory 
on materials for “cloaking” objects from 
electromagnetic waves returned to China 
with sensitive, government-funded findings 
that he used to start a successful tech com-
pany. Such incidents have prompted the
Federal Bureau of Investigation to begin to 
meet with university officials to brief them 
on information security issues. 

Adding to the worries is the growing 
number of researchers who receive funding 
from—and run laboratories in—the United 
States and another nation, potentially open- 
ing a conduit for the transfer of data and 
technology. U.S.-based scientists are also be- 
ing targeted by so-called talent recruitment 
programs run by China and others, which 
offer lucrative funding. 

In general, NIH and other federal re-
search funders don’t bar U.S.-based grant-
ees from receiving foreign funding, often
sponsor research conducted jointly with
scientists in other nations, and encourage 
grantees to freely share the results of funded 
research unless the government stamps it 
as classified. But grantees do have to inform 
the government if they patent research dis- 
coveries as well as disclose all sources of 
funding when applying for a grant. 

Collins told Science that NIH’s interest in 
the issue was prompted not by “some big 
explosive episode” involving a violation of 
such rules, but “just a gathering sense that 
it’s time to take action.” Agency officials 
have spotted NIH-funded papers noting 
foreign support that had not been properly 
disclosed to NIH itself, for instance. The 
agency is also concerned about NIH-funded 
scientists who spend several months a year 
in their home country at what Collins called 
shadow labs, making it hard to tell which 
country is funding their discoveries. NIH 
won’t name the six or so institutions that it 
is investigating, but Collins told STAT that 
the agency is concerned that some research- 
ers have hidden their foreign ties because 
they intend to share intellectual property or 
private information with other nations. But 
it “may all turn out to be fine—they forgot 
to tell us something,” he says. 

NIH also is moving to defend its peer- 
review system, which annually uses thou- 
sands of volunteer reviewers to evaluate 
more than 80,000 applications—many of 
which include unpublished findings. A par- 
ticular concern, the letter states, is “sharing 
of confidential information on grant appli- 
cations by NIH peer reviewers with others, 
including foreign entities.” 

Academic groups say they share NIH’s 
concerns. Schools “look forward to work- 
ing with NIH to identify opportunities to 
mitigate breaches and help ensure accurate 
reporting,” says Lisa Nichols of the Council 
on Governmental Relations in Washington, 
D.C., which tracks regulatory issues for uni- 
versities. They can’t always uncover prob- 
lematic foreign links on their own, adds 
Tobin Smith of the Association of American 
Universities, also in Washington, D.C. “We're 
going to have to do a better job of making 
sure that faculty are being honest,” he says. 

Some lawmakers in Congress would like to 
see stricter oversight of foreign-funded proj- 
ects on U.S. campuses. A draft amendment 
to a recent defense spending bill, for exam- 
ple, would have allowed the Pentagon to bar 
funding for U.S.-based researchers who re- 
ceived support from talent recruitment pro- 
grams funded by foreign governments. The 
provision—perceived as targeting China— 
was ultimately shelved in favor of language 
asking the Department of Defense to work 
with universities to examine the risks and 
benefits of such arrangements.

BIOMEDICINE


In dogs, CRISPR fixes a muscular dystrophy 


Treatment repairs gene in beagles by further mutating it, but human trials are far off 


By Jon Cohen 


Fighting fire with fire, researchers work-
ing with dogs have fixed a genetic glitch
that causes Duchenne muscular dystro-
phy (DMD) by further damaging the
DNA. The unusual approach, using the
genome editor CRISPR, allowed a mu-
tated gene to again make a key muscle pro- 
tein. The feat—achieved for the first time in a 
large animal—raises hopes that such genetic 
surgery could one day prevent or treat this 
crippling and deadly disease in people. An es- 
timated 300,000 boys around the world are 
currently affected by DMD. 

The study monitored just four dogs for 
less than 2 months; more animal experi- 
ments must be done to show safety and effi- 
cacy before human trials can begin. Even so, 
“I can’t help but feel tremendously excited,”
says Jennifer Doudna of the University of 
California, Berkeley, who 
heard the results last 
week at a CRISPR meet- 
ing she helped organize. 
“This is really an indica- 
tion of where the field is 
heading, to deliver gene- 
edited molecules to the 
tissues that need them 
and have a therapeutic 
benefit. Obviously, we’re 
not there yet, but that’s 
the dream.” 

The study, which also 
appears online this week in Science, was led 
by molecular biologist Eric Olson of the Uni- 
versity of Texas (UT) Southwestern Medical 
Center in Dallas, whose team earlier had 
similar results in mice. “We wanted to put 
this to the ultimate test and see if we could 
do it in a large animal,” Olson says. The pos- 
itive findings—CRISPR quickly restored the 
protein dystrophin in critical body muscles, 
including the heart—“brought tears to the 
eyes and were jaw-dropping,” he says. 

The study offers little evidence that dogs 
regained muscle function, however, and 
that, coupled with the short duration of 
the study and the small number of animals 
studied, left some scientists less enthusias- 
tic. One researcher in the tight-knit DMD 
field who asked not to be named wonders 
whether the study was rushed to help draw 
investment in Exonics Therapeutics, a 
Boston-based company Olson launched last
year to develop the potential treatment.

Olson says his team worked quickly not 
because of corporate ambitions, but rather to 
prove the concept before expanding to lon- 
ger, more thorough dog experiments that ul- 
timately are needed to launch human trials. 
The small number of animals initially studied, he adds, re-
flects sensitivities about experimenting with
dogs. “We’re very mindful of ethical concerns 
and have done our best to keep our use of 
dogs to an absolute minimum.” 

The dystrophin gene, the largest in the 
human body, contains 79 separate coding 
regions, or exons, that work together to cre- 
ate a protein that has 3500 amino acids. That 
much DNA offers a lot of opportunities for 
mutations that can cause DMD. But only one 
functional copy of the gene is needed, and 
because it sits on the X chromosome, girls 
have a backup copy. Boys with their one copy 
disabled develop walking problems early in 


Dystrophin (green) is abundant in normal dog muscle cells (left), almost absent in those of a beagle with 
a muscular dystrophy (middle), and partially restored in an affected beagle treated with CRISPR (right). 


life and die on average in their mid-20s from 
heart and respiratory failure. 

About 13% of boys with DMD have muta-
tions in a region between exons 45 and 50,
which bump exon 51 “out of frame” and
throws a wrench into the cellular machinery 
that reads the gene’s instructions, stopping 
production of dystrophin. In 2009, a team 
led by Richard Piercy at the Royal Veterinary 
College in London identified a spaniel with 
signs of DMD that had a spontaneous muta- 
tion deleting exon 50, which similarly moves 
exon 51 out of frame. They later bred a rela- 
tive of that dog with beagles, which have long 
been used in biomedical research, to create a 
colony with DMD symptoms. 

Together with Piercy’s group, Olson and 
colleagues designed CRISPR’s molecular scis- 
sors to make a cut at the beginning of exon 
51 in the diseased beagles. The team hoped 
that when the cell tried to repair the slice, it
would accidentally introduce errors to exon
51, leading its protein-making machinery
to skip the exon altogether and produce a 
shortened but still functional dystrophin. (A 
newly approved drug for DMD, eteplirsen, 
promotes such exon-skipping as well, but its 
efficacy remains hotly debated.) 
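
The frame-shift logic described above comes down to simple arithmetic. The cell reads a gene three nucleotides at a time, so a message that is missing one or more exons stays readable downstream only if the total number of missing nucleotides is a multiple of three. The sketch below illustrates that rule. The exon lengths are hypothetical values chosen for illustration, not figures from the study, and `keeps_reading_frame` is an illustrative helper name.

```python
def keeps_reading_frame(missing_exon_lengths):
    """Return True if losing exons of these lengths (in nucleotides)
    leaves the downstream reading frame intact."""
    return sum(missing_exon_lengths) % 3 == 0


# Hypothetical exon lengths in nucleotides, for illustration only.
EXON_LENGTHS = {50: 109, 51: 233}

# Exon 50 deleted: 109 is not a multiple of 3, so exon 51 is read out of frame.
deletion_only = [EXON_LENGTHS[50]]

# Exon 50 deleted and exon 51 skipped: 109 + 233 = 342 = 3 * 114, frame restored.
deletion_plus_skip = [EXON_LENGTHS[50], EXON_LENGTHS[51]]

print(keeps_reading_frame(deletion_only))       # False
print(keeps_reading_frame(deletion_plus_skip))  # True
```

Under these assumed lengths, skipping the second exon removes more of the protein yet restores the frame, which is why a shortened dystrophin can still be produced.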

Another challenge was to alter billions 
of muscle cells throughout a living animal. 
So the team enlisted a helper: a harmless 
adeno-associated virus that preferentially 
infects skeletal muscle and heart tissue. Two 
1-month-old dogs received intramuscular 
injections of the virus, engineered to carry 
CRISPR’s molecular components. Six weeks 
later, those muscles were making dystrophin 
again. Those results led the researchers to 
give an intravenous infusion to two more 
dogs, also 1 month old, to see whether the 
CRISPR-carrying viruses could add the ge- 
nome editor to muscles throughout the body. 
By 8 weeks, Olson told
the meeting, dystro-
phin had climbed to
relatively high levels in
several muscles, reach-
ing 58% of normal in
the diaphragm and 
92% in the heart. But 
because the dogs were 
euthanized, Olson could 
show little evidence 
that they had avoided 
DMD symptoms, save 
for a dramatic video of 
a treated dog walking and jumping normally. 

“There are a lot of questions that have 
to be addressed,” acknowledges Leonela 
Amoasii, who works in Olson’s lab at UT 
Southwestern and is director of gene edit- 
ing at Exonics. Skeletal muscle is constantly 
being replaced, so the treatment would have 
to reach its stem cells to avoid the need for 
repeated injections. Longer studies will be 
needed to make sure that the CRISPR treat- 
ment does not introduce cancer-causing mu- 
tations. Even if it safely restores the ability 
to make dystrophin, the treatment likely will 
only help boys who receive it early in life be- 
cause the muscle damage is irreversible. And 
ultimately the treatment would have to target 
many other DMD-related mutations to help 
most boys with the disease. “We have to make 
sure that we dot all the i’s and cross all the t’s 
because the implications for both DMD and 
CRISPR therapy are immense,” Olson says.

REPRODUCIBILITY


Social science studies get a ‘generous’ test 


New replication effort aimed to detect effects overstated in the original reports 


By Kelly Servick 


In 2013, social psychologist David Kidd,
then a graduate student at The New
School in New York City, learned that
his very first paper as lead author had
passed peer review and would be pub-
lished in Science. Now, Kidd’s paper,
which suggested that reading literary fic- 
tion improves a person’s ability to intuit 
the mental states of others, has come un- 
der scrutiny again—with a less gratifying 
outcome. It is among eight studies called 
into question by a painstaking effort to 
replicate all 21 experimental social science 
papers published in Science and Nature be- 
tween 2010 and 2015. 

Called the Social Sciences Replication 
Project, it is the latest bid by the non- 
profit Center for Open Science (COS) in 
Charlottesville, Virginia, and far-flung col- 
laborators to quality check the scientific 
literature. Like its predecessors, the new 
effort found that a large fraction of pub- 
lished studies don’t yield the same results 
when done a second time. But this time, 
the five independent research teams that 
did the replications strove to give the 
studies the benefit of the doubt: They in- 
creased the statistical power of the stud- 
ies by enlisting, on average, five times as 
many participants as the originals. “This is 
an effort to be very generous,” says Brian 
Nosek, a psychologist at the University of 
Virginia in Charlottesville who co-founded 
COS and whose lab conducted five of the 
new replications. 

That may help explain why the new proj- 
ect successfully replicated 62% of the ex- 
periments, compared with 39% in a much 
larger study of papers in three psychology 
journals, which COS and collaborators re- 
leased in 2015 (Science, 28 August 2015, 
p. 910). A similar project scrutinizing eco- 
nomics studies reported in 2016 that it had 
replicated 61% of experiments. 

The current findings, published this 
week in Nature Human Behaviour, seem to
contradict the claim that studies in high- 
profile journals, which put a premium 
on groundbreaking or surprising results, 
are less reproducible than those in more 
specialized journals. But cognitive psycho- 
logist Hal Pashler at the University of 
California, San Diego, cautions that dif-
ferences in replication rates between vari-
ous projects probably aren’t statistically
significant. And the 62% figure “certainly
is consistent with there being a problem”
in the field, he says. “It seems funny that
there’s been a drift in standards to the
point where 62% seems very respectable.”

836 31 AUGUST 2018 • VOL 361 ISSUE 6405

The teams aimed to test the notion that 
many studies are hard to reproduce be- 
cause the claimed effect, though real, is in- 
flated. If so, a replication effort would need 
to be more sensitive to find that smaller 
positive effect, Nosek says. “We didn’t want 
low power to be an explanation for why 
some effects didn’t replicate.” 


The replication efforts, almost all of 
them designed in collaboration with 
original authors, were large enough to be 
sensitive to an effect only 75% of the size 
originally reported. If an initial replication 
attempt failed, the researchers added even 
more participants. The approach made a 
difference: Two experiments made the cut 
only after the sample sizes ballooned. 
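
Why detecting a smaller effect takes so many more participants can be sketched with a standard sample-size approximation. The example below is an illustration under stated assumptions, not the project's actual calculation: it assumes a two-group comparison, a two-sided significance level of 0.05, 90% power, and a hypothetical original effect size of 0.5 standard deviations. The function name `n_per_group` is likewise illustrative.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.90):
    """Approximate participants needed per group in a two-group comparison.

    Normal approximation: n = 2 * (z_(1 - alpha/2) + z_power)^2 / d^2,
    where d is the standardized effect size.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return ceil(2 * (z_alpha + z_power) ** 2 / effect_size ** 2)


original_d = 0.5  # hypothetical originally reported effect size
n_full = n_per_group(original_d)
n_shrunk = n_per_group(0.75 * original_d)  # effect 75% as large as reported

print(n_full, n_shrunk)  # 85 150
```

Because the required sample grows with the inverse square of the effect size, an effect 75% as large needs roughly 1/0.75² ≈ 1.8 times as many participants under these assumptions.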

Kidd, now a postdoctoral researcher at 
Harvard University, says the extra rigor 
makes it easier to accept the project’s nega- 
tive verdict on his study (Science, 18 Octo- 
ber 2013, p. 377). “I can’t imagine a reason 
why one would privilege the original find- 
ings over this replication.” But in commen- 
taries published alongside the study, Kidd 
and other authors defend the underlying
hypotheses of their papers. Kidd, for ex-
ample, points out that the project repeated
only one experiment from each paper, and 
in his case, it wasn’t the strongest or the 
most important. 

Even repeating just a fraction of the 
work in the papers took years and cost 
more than $200,000, with plenty of do- 
nated labor and lab time. But the new proj- 
ect also highlights a cheaper way to gauge 
a paper’s replicability. The authors asked 
roughly 200 scientists and students, most 
of them psychologists and economists, to 
guess how likely each study was to be repli- 
cated. These experts also participated in an 
online “prediction market,” trading shares
that corresponded to studies, which paid 
out only if the given study was replicated. 

Both approaches did well at predicting 
the outcome for individual studies, and 
they predicted an overall replication rate 
very close to the actual figure of 62%. The 
finding echoes others suggesting expert 
judgment is a highly accurate proxy for 
replication. “There is definitely some wis-
dom of crowds going on here,” economist
Anna Dreber of the Stockholm School of 
Economics, a member of the replication 
research team, said in a press confer- 
ence. Anecdotal feedback from the expert 
evaluators shows they had higher confi- 
dence in the replicability of studies with 
larger sample sizes, and were more dubi- 
ous of those with surprising or counter- 
intuitive findings. 

If experts can instinctively spot an ir- 
reproducible finding, “that kind of begs 
the question of why that doesn’t seem to 
be happening in peer review,” says Fiona 
Fidler, a philosopher of science at The Uni- 
versity of Melbourne in Australia. But if 
future studies can identify and weigh the 
best predictors of replicability, reviewers 
might be given a rubric to help them weed 
out problematic work before it’s published. 

Another trend may also help tame the 
problem of irreproducible studies: the 
push in many fields for authors to share 
the design of their studies ahead of time, to 
keep them from changing their approach 
midstream in search of a flashy, statisti- 
cally significant result. The studies ana- 
lyzed here mostly predate that shift, Nosek 
says. Whether it will really boost social 
science’s track record, he says, is “the next 
big question.”

THE ALZHEIMER'S GAMBLE


Can the National Institute on Aging turn a funding
windfall into a treatment for the dreaded brain disease?


By Jocelyn Kaiser


When molecular biologist
Darren Baker was winding
up his postdoc studying can-
cer and aging a few years ago
at the Mayo Clinic in Roch-
ester, Minnesota, he faced
dispiritingly low odds of win-
ning a National Cancer Insti-
tute grant to launch his own
lab. A seemingly unlikely area, however,
beckoned: Alzheimer’s disease. The U.S.
government had begun to ramp up research
spending on the neurodegenerative
condition, which is the sixth-leading
cause of death in the United States
and will afflict an estimated 14 million
people in this country by 2050. “There
was an incentive to do some explor-
atory work,” Baker recalls.

Baker’s postdoc studies had fo-
cused on cellular senescence, the
cellular version of aging, which had
not yet been linked to Alzheim-
er’s. But when he gave a drug that
kills senescent cells to mice ge-
netically engineered to develop an
Alzheimer’s-like illness, the ani-
mals suffered less memory loss and
fewer of the brain changes that are
hallmarks of the disease. Last year,
those data helped Baker win his first
independent National Institutes of
Health (NIH) research grant—not
from NIH’s National Cancer Insti-
tute, which he once expected to rely on,
but from the National Institute on Aging
(NIA) in Bethesda, Maryland. He now has
a six-person lab at the Mayo Clinic, work-
ing on senescence and Alzheimer’s.

Baker is the kind of newcomer NIH
hoped to attract with its recent Alzheimer’s
funding bonanza. For years, patient advo-
cates have pointed to the growing toll and
burgeoning costs of Alzheimer’s as the U.S.
population ages. Spurred by those projec-
tions and a controversial national goal to
effectively treat the disease by 2025, Con-
gress has over 3 years tripled NIH’s annual
budget for Alzheimer’s and related demen-
tias, to $1.9 billion. The growth spurt isn’t
over: Two draft 2019 spending bills for NIH
would bring the total to $2.3 billion—more
than 5% of NIH’s overall budget.

Such a dramatic increase in research
funding for a disease has no precedent
at NIH aside from the War on Cancer,
an effort launched in 1971, and an explo-
sion of AIDS funding in the late 1980s.
With the largesse come logistical chal-
lenges. Overworked NIH staff are scram-
bling to review and process thousands of
grant proposals, including those for this
year’s $414 million bolus—a sum that
equals the entire budget of some smaller
NIH institutes—which Congress approved
in March.

Catching up
The National Institutes of Health (NIH) has dramatically ramped up
funding for only three specific disease priorities: cancer, AIDS, and,
most recently, Alzheimer's.
*Alzheimer's disease funding, which NIH began to track in 2008,
does not include related dementias.

NIA, which oversees the new funds, 
doesn’t just want to plump up existing
Alzheimer’s labs, says Director Richard
Hodes. The institute is also luring investi- 
gators, such as Baker, from other fields to 
bring in fresh ideas. Many are answering 
the call. “Nearly everyone I know is put- 
ting the words ‘Alzheimer’s disease’ in their 
grants in an effort to tap into the money,”
says Matt Kaeberlein of the University of 
Washington in Seattle, who studies aging. 

The funding blitz targets a problem that 
looks more intractable than ever. The only 
approved drugs for Alzheimer’s don’t stop 
the neurodegeneration, but merely 
treat symptoms—and not very well. 
In the past year, several major clini- 
cal trials based on the field’s leading 
hypothesis—that reducing the level 
of β-amyloid plaques that riddle
the brains of Alzheimer’s patients 
would halt disease progression—have 
flopped. An antibody that targets β-
amyloid recently delivered seemingly
promising results in a phase II trial. 
Yet given past failures for other ea- 
gerly watched compounds, many re- 
searchers remain skeptical and want 
to see a larger phase III trial. 

Those setbacks have amplified 
concerns that U.S. officials and some 
scientists have oversold the plan for 
a treatment by the middle of the 
next decade. “I am convinced that 
we are destined to fail to make the 
2025 goal and therefore look like we 
have failed at our promise,” says Alzheim- 
er’s researcher Samuel Gandy of the Icahn 
School of Medicine at Mount Sinai in New 
York City. Some researchers also worry 
about focusing so much money on just Al- 
zheimer’s. The biomedical community “has 
mixed feelings” about such targeted fund- 
ing, says biogerontologist Judy Campisi of 
the Buck Institute for Research on Aging in 
Novato, California, who wonders whether 
more should go to basic research.
Even Baker has qualms. “I think it is
great that there’s all of this funding. I just 
hope it’s not at the expense of something 
interesting in the cancer realm.” 

But naysayers are few. “Overall, what 
is wrong with it? Nothing,” says bio- 
chemist Rozalyn Anderson of the University 
of Wisconsin in Madison, who studies caloric 
restriction in monkeys to slow aging and is 
now tying that work to Alzheimer’s. “It’s a 
great experiment underway: By increasing 
funding and access to resources, can we 
bring on a game-changer in research in a 
particular area?” 


A “CONFLUENCE OF FACTORS” unleashed the 
funding surge, says Sue Peschin, president 
and CEO of the Alliance for Aging Research 
in Washington, D.C. Families became more 
open about the once-hidden disease, and ad- 
vocates became savvier. In the late 1990s, the 
Alzheimer’s Association in Chicago, Illinois, 
and later other groups began to frame care 
for Alzheimer’s patients as a financial crisis 
looming as the large baby boomer population 
ages. Alzheimer’s already costs Medicare and 
Medicaid $186 billion per year, and the figure 
will balloon to $750 billion by 2050, accord- 
ing to the Alzheimer’s Association. 

Advocates also argued that Alzheimer’s 
is underfunded in the United States in com- 
parison with major killers such as cancer 
and heart disease. That’s especially true for 
AIDS, which until recently received a fixed 
10% of NIH’s overall budget—it now gets 
$3 billion per year—yet affects far fewer 
Americans. “Neurodegenerative diseases 
have historically never really had the same 
funding. In a sense this is a correction,” 
says Alzheimer’s researcher John Hardy of 
University College London. 

Those messages resonated with U.S. law- 
makers, including Senator Susan Collins 
(R-ME) and then-Representative (now
Senator) Edward Markey (D-MA), who in 
1999 co-founded the Congressional Task 
Force on Alzheimer’s Disease. In 2011, they 
co-sponsored the National Alzheimer’s 
Project Act, which called for a U.S. plan to 
improve research and care for people with 
Alzheimer’s and related dementias. After 
Congress passed the bill, the Department of 
Health and Human Services (HHS), NIH’s 
parent department, outlined ambitious 
goals, the most striking being to “prevent 
and effectively treat Alzheimer’s disease by 
2025.” Some Alzheimer’s researchers had 
misgivings about the deadline, says David 
Holtzman of Washington University School 
of Medicine in St. Louis, Missouri: “I don’t 
think most thought it was realistic.” 

The Mayo Clinic’s Ronald Petersen, who
chaired the advisory board that drafted
the HHS plan, defends the 2025 goal: “We
wanted to make a bold statement. Not ‘We
hoped to make progress.’ That isn’t going
to inspire anybody.”

Deadly rise
The number of people in the United States
with Alzheimer's disease may reach nearly 14 million by
the middle of this century.

Shifting priorities
Researchers seeking Alzheimer's drugs are choosing
targets other than β-amyloid and tau, the proteins
long thought to be the key to treatments. The bars
below reflect the percentage of National Institute on
Aging grants for basic research devoted to various
topics in 2008 and 2017.

As more lawmakers joined the cause, 
Congress in 2015 mandated that NIH pre- 
pare a “professional judgment” budget on 
Alzheimer’s research, a wish list of needs 
to meet the 2025 target that would bypass 
the federal budget process and go directly 
to the president and Congress. Until then, 
only cancer and AIDS had enjoyed that 
special treatment. Alzheimer’s advocates 
also lobbied the administration of former 
President Barack Obama to include extra 
funding in the White House budget request, 
Peschin says. 

The lobbying began to pay off as 
early as 2012 when then-HHS Secretary 
Kathleen Sebelius held a press conference 
to announce modest increases in fund- 
ing for Alzheimer’s research. That gained 
the attention of some scientists, including 
Baker, who submitted his grant proposal to 
NIA in 2015. However, the big ramp-up be-
gan only in 2016 after Obama and lawmak-
ers struck a deal to lift federal spending 
caps and Congress boosted NIH’s overall 
budget after a decade of stagnation. That 
fiscal year, the share of NIH money going 
to Alzheimer’s shot up 56% to $986 mil- 
lion, including $57 million for separate 
research on three related dementias, such 
as vascular dementia. By now, 3 years of 
such funding boosts have transformed 
NIA—once a midsize NIH institute and 
“almost a backwater,” as one official put
it on a blog—to the fifth-largest of NIH’s 
27 institutes and centers with a $2.6 billion 
overall budget. “Our continued investment 
will pay dividends for the millions of fami- 
lies affected by Alzheimer’s,” Collins said in 
a statement to Science. 

The windfall is incredible, says Eliezer 
Masliah, director of NIA’s Division of 
Neuroscience. “I’ve been in this field for 
over 30 years, and I’ve never seen anything 
like this. This is really a golden era for 
[studying] Alzheimer’s disease.” 


NOW, THE ONUS IS ON NIA and the research 
community not to waste the money. Under 
the national plan, NIH holds summits ev- 
ery 3 years to guide its Alzheimer’s efforts, 
targeting the most promising lines of re- 
search. Some 140 treatment or prevention 
trials are underway, testing both drugs 
and preventive interventions such as ex- 
ercise. The funding has supported a con- 
sortium working on novel mouse models, 
genetically engineered to mimic the com- 
mon, late-onset form of the disease. Other 
money goes to modeling the disease by ed- 
iting Alzheimer’s risk genes in neural cells 
derived from stem cells. 

Basic researchers are exploring new hypotheses. 
Some of NIA’s recent funding opportunities 
invite research on alternatives 
to the long-dominant idea that β-amyloid 
deposits outside brain cells and “tangles” 
of the protein tau inside neurons are the key 
drivers of Alzheimer’s disease and the best 
treatment targets. The announcements call 
for proposals in less-explored areas, such 
as the role of protective genes, how neuro- 
degeneration affects other animal species, 
and how metabolic changes might contrib- 
ute to Alzheimer’s. “This brought in many 
people who were reluctant to submit an Al- 
zheimer’s application in part because they 
thought, ‘We’re never going to do well, we’re 
going to be outsiders,’” Hodes says. At a recent 
Senate hearing, he pointed out that of 
452 investigators who won 
new Alzheimer’s and related 
dementia grants from 2015 to 
2017, 27% were receiving their 
first independent NIH grant, 
like Baker, and 36% were es- 
tablished researchers who had 
never had NIH support for Al- 
zheimer’s. (Some had funding 
from Alzheimer’s foundations, 
however.) “We’re not just repeating 
the things that failed 
and hoping we get a different 
result,” Hodes says. 

Masliah says that compared 
with a few years ago, when less 
than half of NIH’s portfolio 
in Alzheimer’s was devoted to 
areas other than β-amyloid or 
tau, it’s now more than 60% for 
translational studies and about 
70% for basic research. “I do be- 
lieve there is more money avail- 
able for us to explore these other 
ideas,” says Carol Colton of Duke University in 
Durham, North Carolina, who studies inflam- 
mation as a possible cause of Alzheimer’s. She 
and others add, however, that the academics 
called on to review NIH grant proposals are 
sometimes less open-minded than NIA staff 
and nix proposals in new areas. They “need 
to catch up,” Colton says. 

To cast an even wider net, NIA is offering 
1-year funding supplements to researchers 
already funded by NIH in other areas who 
want to add an Alzheimer’s component to 
their research. The hope is that the extra 
money will lead to full-fledged proposals. 

Alzheimer’s grants are now much easier to 
get than other NIA grants: For most Alzheim- 
er’s proposals this year, those ranked in the 
top 28th percentile by peer-review panels 
get money. For non-Alzheimer’s grants, that 
pay line is the 19th percentile. The compe- 
tition for grants is still stiff, Hodes stresses. 
After all, he notes, high-quality applications 
for the Alzheimer’s pool of money have “increased 
dramatically” in the last couple years 
“as word got out.” 

NIA grantees in fields with scarcer fund- 
ing aren’t complaining, so far. Some recipi- 
ents even suggest they're benefiting because 
competitors in the field of aging are shifting 
into Alzheimer’s. “Paradoxically, the new 
funding injection could improve everyone’s 
chances of funding,” says Duke psychologist 
Terrie Moffitt, a member of NIA’s 
advisory council. 


NIA HAS HAD TO BE CREATIVE to cope with 
the tide of applications for the Alzheimer’s 
bounty, agency officials say. After a crushing 
scramble to process grant proposals last summer, 
this year NIA called early for proposals 
and scheduled peer-review panels even before 
it knew its final 2018 budget. Adding to 
the pressure, President Donald Trump’s administration 
imposed a federal hiring freeze 
last year that was only recently lifted at NIH. 
“I think our staff has managed heroically to 
still be doing an extremely conscientious job. 
... Where we’ve compromised probably is the 
quality of life of a lot of our staff,” Hodes says. 


Senator Susan Collins (right), visiting a retirement home specializing in dementia 
care, co-sponsored a bill that made research on Alzheimer’s disease a national priority. 

At the NIH Center for Scientific Review 
in Bethesda, which arranges peer-review 
panels for much of the funding, “We’re 
handling the load as best we can,” says 
acting Director Noni Byrnes. The pool of 
potential reviewers—U.S. Alzheimer’s re- 
searchers who aren’t applying for the new 
funding themselves and so don’t have a 
conflict of interest—is limited. So, for NIA- 
organized review panels, the institute is 
also using Alzheimer’s experts in Canada 
and Europe, says Ramesh Vemuri, NIA’s 
chief of scientific review. 

Clinical trials won’t be easy to staff either. 
Clinical researchers and neuropathologists 
focused on dementia are in short supply, 
says Alzheimer’s Association Chief Science 
Officer Maria Carrillo. NIA is trying to at- 
tract them by funding fellowships. Another 
huge problem is finding enough subjects for 
trials—especially those who are at high risk 
for the disease but still without symptoms, 
the population on which some researchers 
think amyloid-busting drugs could yet work. 
NIA plans to launch a national recruitment 
strategy that includes raising awareness 
about trials. 

Looming over the massive research push 
is the 2025 goal. It was set when optimism 
ran high that drug trials based on the 
β-amyloid hypothesis would pan out, Carrillo 
and others say. But if patients 
must begin antiamyloid treat- 
ments well before symptoms 
set in, seeing clinical benefits 
could take decades, Gandy 
notes. And the chances of 
meeting the deadline by tar- 
geting a different disease 
mechanism are small; such 
treatments remain far off. 
Still, Holtzman hopes for good 
news from an antiamyloid 
treatment trial. “Something 
is likely to be approved by 
2025. It won’t be the be all, 
end all,” he says, but he hopes 
it will keep everyone motivated. 
“Because we don’t just 
need money from the NIH, 
we need the pharmaceutical 
industry to not drop out,” as 
Pfizer did this year when it 
announced it was abandoning 
Alzheimer’s research. 

Some researchers point to the mixed suc- 
cess of NIH’s other disease “wars”: AIDS 
funding hasn’t led to a cure or a vaccine, 
though it has yielded drugs that allow 
people infected with HIV to lead nearly 
normal lives. The war on cancer has led to 
treatments that are improving survival, but 
cancer remains the second-leading cause of 
death in the United States. 

That history makes former NIH Director 
Harold Varmus cautious about the 2025 goal. 
“No one denies the enormous need to make 
progress against Alzheimer’s,” he says. But, “I 
wish a date were not attached.” 

Hodes concedes that, like real wars, dis- 
ease wars can last far longer than anyone 
imagined—or feared. But that doesn’t mean 
it was a mistake to launch an all-out offen- 
sive against Alzheimer’s disease, he says. “If 
2025 comes and we haven’t achieved all we 
wanted, I’m not going to stop there and de- 
clare failure.” 


31 AUGUST 2018 * VOL 361 ISSUE 6405 841 



EVOLUTION 


Venoms to the rescue 


Insights into the evolutionary biology of venoms 
are leading to therapeutic advances 


By Mandë Holford, Marymegan Daly, 
Glenn F. King, Raymond S. Norton 


Venomous animals have been admired 
and feared since prehistoric times, and 
their venoms have been used to both 
benefit and impair human health. In 
326 BCE, Alexander the Great encoun- 
tered lethal arrowheads in India that, 
based on the symptoms of dying soldiers, 
were most likely laced with venom from the 
deadly Russell’s viper. By contrast, snake 
venom has been used in Ayurvedic medicine 
since the 7th century BCE to prolong life 
and treat arthritis and gastrointestinal ail- 
ments, while tarantulas are used in the tra- 
ditional medicine of indigenous populations 
of Mexico and Central and South America. 
The modern era of venom research has so far 
yielded six venom-derived drugs (1). Recent 
work has elucidated the evolutionary biology 
of venoms and provided an impressive diver- 
sity of new therapeutic drug candidates. 
Venomous organisms are ubiquitous. All 
known animal phyla contain venomous spe- 
cies. There are more than 220,000 known 
venomous animal species, or ~15% of all 
described animal biodiversity on Earth. Ven- 
omous animals inhabit virtually all marine 
and terrestrial habitats, ranging from desert 
snakes and scorpions to Antarctic sea anem- 
ones and jellyfish. However, most of their 
venoms have not been studied. For example, 
invertebrates make up more than 90% of all 
extant species, yet we know very little about 
their venoms (2). In large part, this neglect 
has been due to the lack of appropriate 
technologies for studying the tiny amounts 
of venom that can be extracted from small 
animals. However, the recent revolution in 
omics technologies (genomics, transcriptomics, 
proteomics) has enabled the study of 
venoms from animals that are small, rare, or 
hard to maintain in the lab (3). 

These recent investigations of a broad 
range of animals have provided new per- 
spectives on venom diversity, evolution, and 
pharmacology (see the figure). For example, 
studies of venom in Cnidaria, the most an- 
cient venomous lineage that evolved ~500 
million years ago, suggest that there are 
differences in the tempo, mode, and nature 
of evolution in old and young venoms, with 
evolutionarily older venoms experiencing 
relatively more purifying selection (4). This 
observation necessitates reconsideration of 
commonly held ideas about toxin evolution 
based on studies of snakes and cone snails. 
The latter animals are relatively young in 
an evolutionary sense, and their venom 
toxins are still undergoing strong diversify- 
ing selection. 

Another example of how studies of ne- 
glected or understudied venomous organ- 
isms are proving beneficial is the recent 
examination of centipede venoms. The re- 
sults show both toxin radiations within gene 
families and convergent recruitment of toxin 
genes, highlighting the multiplicity of pro- 
cesses by which venom toxins arise and di- 
versify even within a single lineage (5). The 
discovery of venom in remipede crustaceans, 
the sister lineage to Hexapoda (Arthropoda), 
provides context for understanding and com- 
paring the similarities and differences in the 
venom of centipedes, spiders, and insects 
(6). Extensive research on a broad range of 
organisms is imperative to derive and test 
robust hypotheses about venom as it relates 
to species diversification and predator-prey 
interactions, and to describe the immense 
biodiversity of animals on Earth. 

The organs that produce venom add yet 
another dimension to the evolutionary pres- 


sures that govern venom phenotype and 
genotype, and as we investigate a broad 
swathe of venomous organisms, we clarify 
more of the story. For instance, in centipedes, 
the nature of the venom production facili- 
ties probably constrains venom diversity (5), 
whereas it may enhance it in cone snails (7) 
and assassin bugs (8). Centipedes modified 
the first pair of walking legs into append- 
ages (forcipules) able to deliver venom and 
evolved venom glands from cuticular dermal 
glands (5), whereas assassin bugs have two 
morphologically distinct venom glands that 
each produce entirely different venoms with 
contrasting ecological roles—one venom is 
used for predation, and the other for defense 
against predators (8). Similarly, the distal and 
proximal portions of cone snail venom glands 
can be subdivided to deliver venom toxins 
needed for predation or defense (7). Under- 
standing the bifurcation of venom use has 
important implications for deciphering how 
venom has evolved. 

The acquisition of venom is a transfor- 
mative event in the evolution of an animal, 
because it remodels the predator-prey in- 
teraction from a physical to a biochemical 
battle, enabling venomous animals to prey 
on, and defend themselves against, much 
larger animals. The ongoing evolutionary 
arms race between venomous animals and 
their prey, which can evolve resistance to 
venoms, has resulted in venoms that are ex- 
tremely complex, with toxins that seek out 
physiological molecular targets with exqui- 
site potency and selectivity (9). 

Just as venom frequently evolves by con- 
vergent evolution from a limited suite of 
common genes, mammalian resistance to 
venom relies, at least in part, on independent 
instances of modification of a limited suite 
of genes (10). There is substantial similarity 
throughout the animal kingdom in the basic 
molecular structure of venom toxins and 
their targets. The latter include neuronal ion 
channels and receptors in the case of neu- 
rotoxic venoms (most invertebrate venoms) 
and components of the blood coagulation 
cascade in the case of hemotoxic venoms 
(leech, snake, and lizard venoms). Indeed, the 
definitive pharmacological markers for many 
ion channels are venom toxins, such as tetrodotoxin 
in the case of voltage-gated sodium 
channels and α-conotoxins for nicotinic acetylcholine 
receptors. Ion channels and receptors 
drive physiological processes involved 
in everything from seeing to breathing, and 
venom peptides are providing the tools with 
which to investigate, manipulate, and modify 
these macromolecular machines (11). 

The specificity, potency, stability, and 
speed with which venom peptides manipulate 
their molecular targets make them ideal 
candidates for therapeutics. Examples of the 
value of venom peptides in guiding the development 
of human therapeutics include the 
U.S. Food and Drug Administration (FDA)-approved 
drugs exenatide, an antidiabetic 
peptide from the venomous Gila monster 
(Heloderma suspectum), and ziconotide, an 
analgesic peptide from the venomous cone 
snail (Conus magus). Promising new devel- 
opments for venom peptide therapeutics 
and insecticides include monomeric insulins 
found in the venom of cone snails (12); the sea 
anemone venom peptide ShK for treatment 
of autoimmune diseases (13); chlorotoxin 
from the deathstalker scorpion for imaging 
brain tumors during surgery (14); and spider 
toxins for use as eco-friendly insecticides (15). 


Over 200,000 animal 
species, such as this prairie 
rattlesnake (Crotalus 
viridis), produce venom. 


Venom research is a highly interdisciplin- 
ary enterprise, requiring studies of the biol- 
ogy and ecology of venomous organisms, 
the structure and function of venom deploy- 
ment, the biochemistry and pharmacology 
of venoms, the pathophysiological effects 
that venom induces in prey and predators, 
and the translational development of venom 
components for biomedical and biotechno- 
logical applications. As research on neglected 
or poorly studied venomous organisms gains 
momentum with advanced omics techniques, 
a substantial database of new molecules with 
novel mechanisms of action will be pro- 
duced. A major challenge facing this emerg- 
ing field will be the development of robust 
high-throughput assays for determining the 
molecular targets of these new compounds. 

INSIGHTS | PERSPECTIVES 


Recent progress has greatly expanded 
the known molecular targets of venom tox- 
ins and highlighted the value of venoms as 
a source of pharmacological tools and drug 
leads. Similarly, the field will have to develop 
new computational methods for modeling 
the interaction of toxins and their molecular 
targets to reduce cost, labor, and guesswork 
in identifying selective venom peptides. Un- 
like small molecules, venom peptides have 
substantial barriers that must be addressed 
when considering in silico procedures (11). 
An evolutionarily informed perspective will 
help to focus venom research to leverage the 
extraordinary biochemical warfare nature 
has created to yield transformative develop- 
ments for evolutionary biology, chemical bi- 
ology, and the discovery of therapeutics and 
bioinsecticides. 

[Figure: An evolutionary tree of venomous animals] 
Animal venoms are an important source of potential therapeutic compounds, but traditional methods only 
allowed study of venoms from a few species. Genomic and proteomic advances are now opening the range of 
animal species to scientific study. 

Legend: line, venomous species; dots, studied traditionally and source of approved therapeutic compounds. 
Taxa shown (* marks a taxon carrying one dot in the figure, ** a taxon carrying both): sponges; jellyfish, sea anemones; velvet worms; spiders*; mites; scorpions*; crustaceans; millipedes; centipedes; grasshoppers; true bugs; beetles; ants, wasps, bees; flies; butterflies, moths; nematodes; flatworms; brachiopods; annelids*; clams; snails, slugs*; octopuses, squid; starfish; brittlestars; sea urchins; sea cucumbers; tunicates; lampreys; rays, sharks; bony fish; lungfish; amphibians; turtles; snakes, lizards**; tuatara; crocodiles, alligators; birds; monotremes; marsupials; eutherians. 


SIGNALING 


From oncogenic 
mutation to 
dynamic code 


Oncogenic BRAF mutations 
can distort downstream 
signaling outcomes 


By Walter Kolch and Christina Kiel 


Signal transduction pathways (STPs) 
convert biochemical reactions into 
precise and reproducible biological 
outcomes. These functions are performed 
reliably and reproducibly 
against a background of noise (variation) 
arising from the stochastic nature of 
biochemical reactions (1). Indeed, information 
theory analysis of STPs indicates that 
they have a limited capacity to discriminate 
information, including different ligands or 
different activation states of components 
(2). However, this discrimination is dra- 
matically enhanced by adding dynamic in- 
formation, such as signal rise time, signal 
duration, amplitude, and decay rate (3). 
The observation that differential activa- 
tion dynamics of the extracellular signal- 
regulated kinase (ERK) pathway can deter- 
mine whether rat pheochromocytoma cells 
proliferate or differentiate was reported 
more than 20 years ago (4), and evidence 
has since accumulated that STP dynamics 
control cell fate decisions. However, we are 
still struggling to understand how signaling 
dynamics is encoded and decoded and how 
pathological changes, such as the expres- 
sion of mutant proteins, affect the dynamic 
STP code. On page 892 of this issue, Bugaj et 
al. (5) make use of new tools with which to 
decipher this code and reveal how certain 
cancer-associated BRAF mutations can cor- 
rupt the dynamic STP code and trick cells 
into unlicensed proliferation. 

BRAF is a pivotal kinase in the biochemi- 
cal circuitry that controls cell proliferation 
and transformation. It links the activation 
of RAS, a group of small G proteins that 
are activated by many growth factor receptors, 
to mitogen-activated extracellular 
kinase (MEK) and ERK (6). ERK has >650 
substrates that regulate cell proliferation, 
survival, differentiation, metabolism, and 
many other biochemical processes (7). ERK 
activation dynamics control both the regu- 
lation of gene transcription (8) and periph- 
eral biochemical processes that determine 
whether a cell undergoes proliferation or 
differentiation (9). The RAS-RAF-MEK-ERK 
signaling module is altered in >50% of hu- 
man cancers because of activating muta- 
tions or overexpression of growth factor 
receptors or mutations in RAS and BRAF 
(10). Potent inhibitors of BRAF and MEK ki- 
nases have been developed to treat various 
cancer types, but their clinical deployment 
has resulted in several surprises. For ex- 
ample, RAF inhibitors cause a paradoxical 
activation of ERK because they promote the 
dimerization of BRAF with CRAF, subvert- 
ing a normally transient element of physi- 
ological RAF activation into a long-lasting 
change of ERK activation dynamics, which 
leads to resistance to RAF inhibitors (10). 
This has led to the hypothesis that drug 
mechanisms of action have to be consid- 
ered in the context of STP topologies and 
dynamics, rather than only being optimized 
for inhibiting a specific target. 

Bugaj et al. used optogenetic tools to 
switch on and off the expression of either 
RAS or BRAF with precise temporal dynam- 
ics. By modulating the frequency of activa- 
tion, they could investigate the effects of 
common RAS and BRAF mutations on ERK 
activation dynamics. Interestingly, they 
found that in cells with different oncogenic 
RAS and BRAF mutations, ERK activity was 
still dependent on optogenetic RAS activa- 
tion. Recent data indicate that oncogenic 
RAS mutants can still transition between 
active and inactive conformations and thus 
could maintain responsiveness to growth 
factor stimulation (11), whereas the activity 
of the BRAF mutants studied by Bugaj et al. 
was previously deemed to be RAS-indepen- 
dent (12). If confirmed, this suggests that 
in general, oncogenically activated proteins 
are not deadlocked into a constitutively 
active state but are still subject to regula- 
tion. This could change the way we design 
combination therapies by introducing the 
simple principle that inhibiting upstream 
activators of an oncogenic protein will aug- 
ment the efficacy of drugs targeting the on- 
cogenic protein itself. 

Cancer mutations can distort the dynamic signaling code 
Oncogenic mutations were thought to lock oncoproteins in an active state. It is now appreciated that protein-protein 
interactions (PPIs) and network context cause dynamic changes in the outcome of signaling. 

Additionally, the common BRAF-V600E 
mutant produced fast-responding pulses of 
ERK activation, as occurs when BRAF is not 
mutated, in response to optogenetic activation 
of RAS. By contrast, oncogenic mutations 
in the adenosine triphosphate (ATP)-binding 
loop (P-loop) of BRAF induced delays 
in both activation and deactivation of ERK, 
causing pulses to merge into sustained ERK 
activity at higher frequencies of RAS stimulation. 
Using various RAF and MEK inhibitors 
and an optogenetic BRAF activation 
system, the authors mapped the source of 
this reduction of kinetic resolution to BRAF 
itself. They suggest that the requirement for 
P-loop mutants to dimerize with and signal 
via activated CRAF blurs and broadens the 
ERK response kinetics and leads to changes 
in early gene expression and proliferation. 
Importantly, this finding ties an oncogenic 
mutation to alterations in signaling dynamics. 
This replaces the traditional view that 
mutations are off-on switches with a more 
nuanced picture in which mutations distort 
the dynamic STP code (see the figure). 
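
The merging of pulses described above can be illustrated with a toy calculation. The sketch below is not the model used by Bugaj et al.; it assumes, purely for illustration, that ERK activity follows a square-wave RAS input with simple first-order kinetics, and the names `simulate_erk`, `ripple`, `tau_on`, and `tau_off`, as well as all numerical values, are hypothetical. It shows only the general principle that slow activation and deactivation act like a low-pass filter, so that a pulsed input is read out as nearly sustained activity.

```python
def simulate_erk(period, tau_on, tau_off, t_end=200.0, dt=0.01):
    """Euler-integrate a first-order response to a square-wave RAS input.

    The input is 1 for the first half of each period and 0 for the second
    half. Returns the output trace, a stand-in for ERK activity (0 to 1).
    All time values are in arbitrary units.
    """
    steps = int(t_end / dt)
    y = 0.0
    trace = []
    for i in range(steps):
        t = i * dt
        u = 1.0 if (t % period) < period / 2 else 0.0
        # Use the activation time constant while rising, the
        # deactivation time constant while decaying.
        tau = tau_on if u > y else tau_off
        y += dt * (u - y) / tau
        trace.append(y)
    return trace


def ripple(trace, period, dt=0.01):
    """Peak-to-trough swing of the output over the final input period."""
    n = int(period / dt)
    last = trace[-n:]
    return max(last) - min(last)


period = 10.0  # hypothetical stimulation period

# Assumed time constants: fast kinetics stand in for wild-type-like BRAF,
# slow kinetics for the delayed response attributed to P-loop mutants.
fast = simulate_erk(period, tau_on=0.5, tau_off=0.5)
slow = simulate_erk(period, tau_on=10.0, tau_off=10.0)

ripple_fast = ripple(fast, period)
ripple_slow = ripple(slow, period)
print(f"fast kinetics, peak-to-trough swing: {ripple_fast:.2f}")
print(f"slow kinetics, peak-to-trough swing: {ripple_slow:.2f}")
```

With the fast time constants the output returns nearly to baseline between pulses, whereas with the slow ones it never falls far from its peak, which is the sense in which distinct pulses blur into sustained activity.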

It will be interesting to examine how the 
frequency of RAS activation pulses affects 
the kinetics of BRAF-CRAF heterodimeriza- 
tion, and how P-loop mutations selectively 
modulate these kinetics. P-loop mutations 
diminish MEK binding to BRAF (13), which 
may retard ERK activation. The P-loop is 
also the target for autoinhibitory phosphorylation, 
which is abolished by P-loop mutations 
(14). The lack of autoinhibition could 
plausibly explain the slower signaling decay 
of BRAF P-loop mutants. Understanding the 
mechanism underlying the signal distortion 
in cells expressing these mutants will be im- 
portant to gauge the role of such dynamic 
alterations in malignant transformation. 
BRAF-V600E does not affect the resolution 
of activated RAS signaling in terms of ERK 
activation dynamics, yet it is a potent on- 
cogene, suggesting that altered signaling 
dynamics may contribute to transformation 
by certain mutations but not by others. 

PPIs change activation states 

Certain oncogenic BRAF mutations can alter ERK 
activation dynamics by altering PPIs, which can be 
rectified by pathway inhibitory drugs. 

These intricacies highlight that mutations 
do not just statically affect STPs; they 
can have wide repercussions on functional 
features of STPs, including signaling dy- 
namics and protein-protein interactions 
that change pathway configurations. This 
view has wide ramifications for under- 
standing drug resistance. For instance, the 
mechanism of kinase dimerization convey- 
ing drug resistance is a general principle if 
the kinases can allosterically activate each 
other (15). In a broader context, mutations 
in proteins such as RAS that can connect to 
many STPs may cause a profound rewiring 
of both the topologies and signaling dynam- 
ics of downstream effector pathways. This 
viewpoint also helps rationalize why target- 
ing mutated proteins with potent inhibitors 
has frequently resulted in unexpected re- 
sponses or poor clinical efficacy. It is time to 
move the crosshairs of drug discovery away 
from single molecules and toward consid- 
ering the functional dynamic context that 
potential targets are embedded in. 


REFERENCES 


1. V. Shahrezaei, P. S. Swain, Curr. Opin. Biotechnol. 19, 369 (2008). 
2. R. Cheong et al., Science 334, 354 (2011). 
3. J. Selimkhanov et al., Science 346, 1370 (2014). 
4. C. J. Marshall, Cell 80, 179 (1995). 
5. L. J. Bugaj et al., Science 361, eaao3048 (2018). 
6. D. Matallanas et al., Genes Cancer 2, 232 (2011). 
7. E. B. Unal et al., FEBS Lett. 591, 2607 (2017). 
8. T. Nakakuki et al., Cell 141, 884 (2010). 
9. A. von Kriegsheim et al., Nat. Cell Biol. 11, 1458 (2009). 
10. M. Holderfield et al., Nat. Rev. Cancer 14, 455 (2014). 
11. S. Lu et al., Chem. Rev. 116, 6607 (2016). 
12. Z. Yao et al., Cancer Cell 28, 370 (2015). 
13. J. R. Haling et al., Cancer Cell 26, 402 (2014). 
14. M. Holderfield et al., Cancer Cell 23, 594 (2013). 
15. B. N. Kholodenko, Cell Rep. 12, 1939 (2015). 


10.1126/science.aau8059 




AGRICULTURE 


Insect threats to food security 


Pest damage to crops will increase substantially in many 
regions as the planet continues to warm 


By Markus Riegler 


Globally, one out of nine people suffers 
from chronic hunger, and undernourishment 
is growing (1). Global 
average surface temperatures are 
also rising and are projected to increase 
by 2° to 5°C this century, with 
negative impacts on agricultural production. 
Even today, despite substantial plant pro- 
tection efforts, about one-third of crops are 
lost to insect pests, pathogens, and weeds. 
How will climate warming affect these crop 
losses on a global scale? On page 916 of this 
issue, Deutsch et al. (2) evaluate the impact 
of rising average surface temperatures on 
yield losses due to insects in wheat, maize, 
and rice, which are staple foods for billions 
of people. The results show that insects will 
cause significantly increased grain loss across 
many regions of a warmer world. 

Consumption by insects—and thus dam- 
age to crops—is directly linked to insect me- 
tabolism and population size, both of which 
generally increase with temperature. How- 
ever, insect metabolism and demographics 
vary seasonally, latitudinally, and across in- 
sect species. Deutsch et al. capture some of 
this complexity with a spatially explicit insect 
population metabolism model, which inte- 
grates data on temperature-driven changes 
in metabolic and growth rates that were 
previously collated for 38 insect species from 
different latitudes. The authors estimate ad- 
ditional grain yield losses of 10 to 25% per 1°C 
of global warming. Projected losses are high- 
est in the temperate regions of the Northern 
Hemisphere that produce most grain, as a re- 
sult of increases in insect population growth 
and metabolic rates. In the tropics, projected 
yield loss due to warming is less acute be- 
cause elevated insect metabolic rates will be 
offset by reduced insect population growth 
rates. However, tropical regions already expe- 
rience the highest levels of yield loss. 
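
The contrast between temperate and tropical regions follows from simple arithmetic. The sketch below is not the spatially explicit model of Deutsch et al.; it assumes only that crop consumption scales with the product of per-insect metabolic rate and population size, and the function name `relative_loss_change` and every input value are hypothetical, chosen to illustrate the two regimes rather than taken from the study.

```python
def relative_loss_change(metabolic_change, population_change):
    """Fractional change in crop consumption by insects.

    Assumes consumption is proportional to (per-insect metabolic rate) x
    (population size). Arguments are fractional changes, so 0.10 means a
    10% increase and -0.08 means an 8% decrease.
    """
    return (1.0 + metabolic_change) * (1.0 + population_change) - 1.0


# Hypothetical temperate case: warming raises both metabolism and population.
temperate = relative_loss_change(metabolic_change=0.10, population_change=0.10)

# Hypothetical tropical case: higher metabolism is largely offset by
# reduced population growth.
tropical = relative_loss_change(metabolic_change=0.10, population_change=-0.08)

print(f"temperate change in consumption: {temperate:+.1%}")
print(f"tropical change in consumption:  {tropical:+.1%}")
```

Because the two factors multiply, increases in both compound in the temperate case, whereas in the tropical case a modest fall in population nearly cancels the rise in metabolic rate.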

Many previous studies have investigated 
temperature responses of pest insects, and 
some have predicted how pest populations 
affect yields in the context of climate change 
(3). However, these studies were mostly on 
individual crop types at local or regional 
scales. Few have attempted to link temperature 
responses of pest insects and the damage 
they cause under warming more widely, 
and never on a global scale. 

Insects are highly sensitive to environmen- 
tal temperatures; they can display a variety of 
responses to climate warming (3). The input 
data for the grain loss model derive mostly 
from laboratory studies undertaken at dif- 
ferent constant-temperature regimes (4), yet 
thermal fitness can vary across life history 
stages and be modulated by insect thermo- 
regulatory behavior. According to sensitivity 
tests undertaken by Deutsch e¢ al., the result- 
ing uncertainty does not affect the overall es- 


Acaterpillar of the rice leaf folder (Cnaphalocrocis 
medinalis) infects a rice plant. Because of limited data, 
Deutsch et al. could not include this and many other 
pest species of wheat, maize, and rice in their model. 


timates of global crop loss increases. Further 
research is required to verify the results for 
the many pest species of wheat, maize, and 
rice (see the photo) that Deutsch et al. could 
not include because of insufficient data. 
Insect responses to climate change are 
further complicated by factors not included 
in the study. For example, many insect pests 
are vectors of plant pathogens that also cause 
crop losses. Predictions based on population 
growth and metabolic rates may thus under- 
estimate crop damage due to insect vectors 
under global warming. Scientists should also 
aim to better understand the impact of geo- 
graphic range expansions, human-assisted 
introductions, and biological invasions of 
pest species in a warming world (5). 
Additional intricacies arise from the fact 
that climate warming also affects natural 
enemies of pests and that pests are exposed 
to indirect plant-mediated effects involving 
changed plant nutrient content and defense 
(6). Some responses may be compounded by 
interactive effects of warming with other fac- 
tors, such as changing rainfall patterns and 
increasing atmospheric CO₂ concentrations. 
Furthermore, the evolutionary adaptation 
potential and the role of plasticity in insect 
responses to climate change remain poorly 
understood for field insect populations (7). 
Global warming also involves more frequent 
and intense temperature extremes. More 
common heatwaves may have larger impacts 
than average temperature increases (8). 

The substantial increases in pest damage 
forecast by Deutsch et al. call for action on 
climate change mitigation and adaptation. 
Everyone must be involved in change: farm- 
ers, industries, policy-makers, and the wider 
society. Farming communities are already 
adapting, for example, in choosing which 
crop varieties to plant when and where. There 
is also an increased need to focus on plant 
protection, particularly given that many in- 
secticides are being banned over human and 
environmental health concerns. Reinvigora- 
tion of debates about sustainable farming 
systems (9), including the role of genetically 
modified crop plants (10), is essential.

Humanity faces this food security chal- 
lenge at a time when training and job op- 
portunities for expert entomologists are 
shrinking (11). These experts are urgently
needed to deal with insect pest problems 
and, ironically, with concurrent threats to 
insect biodiversity. Research will also require 
the interplay of empirical and theoretical ap- 
proaches because model-based predictions 
can only be as good as the data that feed 
into them. Ecoinformatics (12) and the use
of big data to answer ecological questions
such as those posed by Deutsch et al. hold
great promise to unravel large-scale system 
responses, and this improved understanding 
is necessary to develop and implement adaptation
strategies.

VIROLOGY


The reemergence of yellow fever 


Since 2016, yellow fever outbreaks have become a major 


public health concern 


By Alan D. T. Barrett 


Yellow fever is a viral hemorrhagic fever
with a case fatality rate up to 50%. It
is caused by yellow fever virus (YFV),
a mosquito-borne flavivirus that is
related to dengue and Zika viruses.
Despite an effective vaccine (17D),
the virus still causes major outbreaks, as
occurred in Brazil between December 2016
and March 2018 where there were >2000
confirmed cases, including >500 deaths, as
well as >4000 epizootics (yellow fever in
nonhuman primates) (1). On page 894 of
this issue, Faria et al. (2) provide a genetic 
investigation of the outbreak in Brazil from 
December 2016 to October 2017, demonstrat- 
ing the origins and movement of YFV during 
the outbreak. They determined that the out- 
break originated in northeastern Brazil and 
moved southward to areas where the virus 
had not been found previously. Surprisingly, 
YFV moved at a rate of 4.25 km/day, which 
probably explains the magnitude of the out- 
break. Modeling infectious disease outbreaks 
with phylogeographic tools (based on the 
geographic distribution of viruses according 
to viral genome sequence) as well as phylody- 
namic tools (based on the interaction of epi- 
demiologic, immunologic, and evolutionary 
factors in viral genetics) has played a critical 
role in understanding outbreaks and devel- 
oping public health countermeasures. 

To date, there has been little modeling 
of YFV outbreaks because few isolates have 
been available to study, and we have instead 
relied on vaccination strategies. The study 
of Faria et al. demonstrates the potential of 
mapping viral incidence and spread almost in 
real time and their potential to contribute to 
control strategies, such as the current World 
Health Organization (WHO) program called 
Eliminate Yellow Fever Epidemics (EYE) that 
aims to eliminate the disease by 2026 (3). 

However, the investigation of yellow fever 
outbreaks in real time is not straightforward. 
YFV has a sylvatic, or forest, transmission cy- 
cle involving tree hole-breeding mosquitoes 
and nonhuman primates. Human cases usu- 
ally occur in forested areas; hence, other than 
during large outbreaks, it is difficult to obtain
virus samples for analysis (4). For these rea-
sons, until June 2016 there were only 42 YFV 
genomic sequences available for study. Today 
this has increased to ~135 genomes, with 
most of the additional sequences coming 
from the outbreak in Brazil. YFV is found in 
44 countries in sub-Saharan Africa and tropi- 
cal South America, and seven virus genotypes 
have been identified (5, 6); many additional 
genomic sequences are needed to understand 
virus activity within the geographic range of 
the virus, especially because RNA viruses 
such as YFV continually evolve. Nonethe- 
less, it is clear that the capabilities are now 
becoming available as the genome database 
expands over time and space. 

WHO, the Pan American Health Organiza- 
tion, and related sponsors have been success- 
ful at controlling yellow fever through mass 
vaccination campaigns such that there were 
no outbreaks in West Africa in 2015, a region 
where historically the most cases of disease 
have been recorded. However, dramatic increases
in yellow fever incidence have occurred
recently, and they are occurring in
areas that had been considered free of yellow
fever (see the figure).

The most dangerous form is urban yellow 
fever, where the transmission cycle involves 
domestic Aedes aegypti mosquitoes and hu- 
mans. Urban yellow fever occurred in 2016 in 
both Angola and the Democratic Republic of 
Congo (DRC), and 11 Chinese workers infected 
in Angola returned to China, where they de- 
veloped yellow fever (7). This was the first 
time that any cases of yellow fever had been 
reported in Asia, but these were imported 
cases and there were no secondary cases in 
China. Furthermore, there was a concurrent 
outbreak in Uganda caused by a virus geno- 
type different from that in Angola (8). World 
vaccine supplies were exhausted twice dur- 
ing early 2016, and a dose-sparing regimen of 
17D had to be used whereby one-fifth of a full 
dose was administered (9, 10). The concur- 
rent clinical trial of dose-sparing vaccination 
has shown that the immune response is not 
inferior to a full dose 4 to 5 weeks after immunization
(10). The epidemic in Africa was
controlled by September 2016. Unfortunately, 
yellow fever was then reported in Brazil in 
December 2016 (1). Again, 17D supplies were 
exhausted and dose sparing was used to im- 
munize 24 million individuals in Brazil. 
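
The arithmetic behind dose sparing can be sketched in a few lines. This is an illustration only: the function names are hypothetical, and the calculation assumes that every person receives exactly one-fifth of a full dose with no wastage, which the text does not spell out.

```python
from fractions import Fraction

# One-fifth of a full 17D dose, as described in the text.
FRACTIONAL_DOSE = Fraction(1, 5)


def people_covered(full_doses: int, dose_fraction: Fraction = FRACTIONAL_DOSE) -> int:
    """People who can be immunized from a stock of full doses (hypothetical helper)."""
    if not 0 < dose_fraction <= 1:
        raise ValueError("dose_fraction must be in (0, 1]")
    # Dividing by the fraction scales the stock up; int() drops any partial person.
    return int(full_doses / dose_fraction)


def full_doses_needed(people: int, dose_fraction: Fraction = FRACTIONAL_DOSE) -> Fraction:
    """Full-dose equivalents consumed when each person gets a fractional dose."""
    return people * dose_fraction


# Assumption: all 24 million people in Brazil received fractional doses.
brazil_full_dose_equivalents = full_doses_needed(24_000_000)
```

Under that assumption, a stock of one million full doses would stretch to five million people, and immunizing 24 million people would consume the equivalent of 4.8 million full doses.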

Whereas urban yellow fever outbreaks
periodically occur in Africa, such as in Angola
and the DRC in 2016, they are rare in
South America. Faria et al. provide persuasive
evidence that the recent Brazilian
outbreak was due to the forest yellow fever
transmission cycle and not the urban cycle.
The last documented large urban yellow fever
outbreak in the Americas was likely in
Brazil in 1928; subsequent urban yellow fever
outbreaks in South America have been
very small, each involving no more than
nine cases. This suggests that the epidemiologies
of urban yellow fever in Africa and
South America are different and require
further investigation.

Yellow fever outbreaks

In 2016-2018, there have been numerous outbreaks of suspected and confirmed cases of yellow fever,
resulting in imported cases to other countries (dashed arrows), but these did not result in secondary cases.

The exhaustion of vaccine supplies on 
multiple occasions, together with the resur- 
gence of YFV activity in the past 3 years, is 
a cause for concern. There were many out- 
breaks between 2016 and 2018, and in addi- 
tion, international travelers caused multiple 
importations into countries outside these 
outbreaks (see the figure). Notably, the Brazil 
outbreak resulted in travelers transporting 
YFV to seven countries during 2018, including
five in Europe (11, 12). By comparison,
only three cases were imported into Europe 
during the previous 16 years. Clearly, we can- 
not rely on YFV control by vaccination alone, 
and modeling is a critical component. 

There are reasons to be optimistic. Ad- 
vances in genomic sequencing technology of 
YFV isolates have enabled modeling of YFV 
activity, which has been routinely undertaken 
for other pathogens such as influenza virus. 
As the database improves, so will our under- 
standing of YFV movement and our ability to 
identify areas where the virus has potential 
to cause outbreaks. Concurrent activities of 
the EYE strategy will result in production of 
vaccine that can be appropriately distributed 
with input from YFV modeling. Therefore, it 
may be possible to eliminate yellow fever epidemics
by 2026, as planned.

Fusion oncogenes—genetic


musical chairs 


Ewing sarcoma—driver fusion genes can result 
from complex genomic rearrangements 


By Marcin Imielinski and Marc Ladanyi


The cytogenetic definitions of many
cancers predate the genome sequenc- 
ing era. Indeed, some classes of can- 
cers (largely subtypes of sarcomas, 
lymphomas, and leukemias) have 
long been defined by simple and dis- 
tinct patterns of chromosomal changes, or 
karyotypes, that, in many cases, feature a 
single pathognomonic somatic transloca- 
tion of two genomic regions that creates a 
fusion oncogene (for example, the Philadel- 
phia chromosome translocation in chronic 
myelogenous leukemia results in the BCR-ABL1
fusion oncogene) (1). Whereas many
common cancers display genomic complex- 
ity consistent with multistep oncogenesis, 
such as carcinomas of breast and lung, can- 
cers that are defined by translocations typi- 
cally display simple karyotypes, suggesting 
that they were shaped by a single translo- 
cation. However, the cytogenetic simplicity 
of these cancers may mask more complex 
genomic events. On page 891 of this issue, 
Anderson et al. (2) report whole-genome 
sequencing (WGS) of 50 Ewing sarcomas 
(EWSs), an aggressive sarcoma that is defined
by fusion between the EWS RNA binding
protein 1 (EWSR1) gene on chromosome
22 and an E26 transformation-specific (ETS)
family transcription factor gene, either FLI1
at 11q24 or ERG at 21q11 (3). Anderson et al.
show that ~40% of EWSR1-FLI1 fusions and
all EWSR1-ERG fusions arise via a complex
rearrangement pattern called chromoplexy, 
which was first identified in prostate cancer 
(4). They suggest that chromoplexy “bursts” 
may be early initiating events in Ewing sar- 
comagenesis and mark a more aggressive 
form of the disease. 
Whereas standard reciprocal transloca- 
tions involve DNA breaks in two fusion 
partners, chromoplexy involves three or
more breakpoints in the genome. Like the
children’s game of musical chairs, in which 
players are forced to stand up and find a new 
seat, three or more broken chromosome ends 
are forced to find a new partner. Unlike musi- 
cal chairs, during which one of the chairs is 
removed at each round of play, every broken 
end finds a new partner, resulting in a loop 
pattern (see the figure). Chromoplexy is thus 
a complex means to an end: the formation of 
functional EWSR1-FLI1 or EWSR1-ERG fusions
that, upon expression, provide a selective
growth or survival advantage.

Although EWSR1-ERG fusions were known
to require more complex rearrangements
because their opposing chromosome orientations
preclude fusion-gene formation via
a simple reciprocal translocation, the discovery
by Anderson et al. of chromoplexy as the
underlying genomic process helps to clarify
this. However, that such a high proportion
of EWSR1-FLI1 fusions also arise by chromoplexy
was not fully appreciated.

Fusions via chromoplexy

Bursts of complex rearrangements can generate
Ewing sarcoma oncogenic fusion genes and are
associated with genomic complexity, more frequent
TP53 mutations, and increased risk of relapse.

Notably, Anderson et al. provide evidence 
to suggest that chromoplexy-associated rear- 
rangements occur simultaneously, presum- 
ably in the EWS cell of origin, the nature of 
which remains unclear. A fascinating ques- 
tion is whether these chromoplexy events 
can be linked to specific features of three- 
dimensional chromatin organization and/ 
or of the transcriptional state of chromatin 
regions involved in chromoplexy rearrange- 
ments in this hypothetical cell type, given 
that they appear to be enriched in early rep- 
licating and transcriptionally active genomic 
regions, which may be prone to breakage as 
they are exposed during transcription. Gene 
pairs involved in fusions are often in close 
proximity in interphase nuclei, regardless 
of their chromosomal location (5), and this 
higher-order contiguity can be induced by 
specific transcription factors (6). A better un- 
derstanding of the transcriptional regulation 
of the genes recurrently involved in chromoplexy-derived
EWSR1-FLI1 and EWSR1-ERG
fusions might reveal the cell state or lineage 
in which EWS arises. 

Given the young age of many EWS patients, 
one may speculate what exogenous or endog- 
enous mutagen could be responsible for such 
a mutational “burst.” Although radiation 
is a likely suspect in any disorder involving 
multiple chromosomal breaks, endogenous 
mutagens such as transposases and cytidine 
deaminases have also been linked to com- 
plex somatic rearrangements. Could EWS 
chromoplexy events be linked, for example, 
to the activity of an aberrantly expressed 
endogenous transposase such as PiggyBac 
transposase 5 (PGBD5), which was recently 
implicated in the genesis of the pathogenic 
gene rearrangements in childhood malignant 
rhabdoid tumors (7)? An alternative pos- 
sibility is a constitutional or acquired DNA 
repair defect (8). Analysis of the sequence 
context surrounding chromoplexy breaks 
may provide clues and potentially point to a 
therapeutic vulnerability that could be used 
to treat EWS. Furthermore, perhaps EWS 
arising from chromoplexy may be responsive 
to immune checkpoint inhibition, given the 
preference of chromoplexy events for tran- 
scriptionally active regions that should result 
in multiple fusion transcripts, most of which 
are likely to be out of frame (except the driver 
fusion gene). Frameshift alterations repre- 
sent an especially rich source of neoantigens 
(9), which can predict response to immune 
checkpoint inhibition.

Although Anderson et al. qualitatively
validate a subset of chromoplexy events 
observed through WGS with spectral karyo- 
type data, a comprehensive bridging of 
the divide between such large-scale views 
of chromosome structure and the detailed 
views of WGS will require the application of 
long-range WGS approaches using linked- 
read or proximity-ligation short-read se- 
quencing and long-read sequencing, which 
uses more expensive and lower-throughput 
technologies to achieve read lengths that 
exceed 10 kilobase pairs. The long-range re- 
construction of highly rearranged loci can 
yield insight into both the mutational pro- 
cesses generating complex structural vari- 
ants and the consequences of these variants 
on DNA sequences (10).

The findings of Anderson et al. raise im- 
portant clinical questions. The contribution 
of genetic analysis to the current standard 
of care for EWS is limited to confirmation of 
the diagnostic EWSR1-FLI1 or EWSR1-ERG
fusions. The discovery of genomic patterns 
associated with subsets of EWSs raises the 
question of whether additional molecu- 
lar diagnostic modalities are warranted. If 
chromoplexy events are important clinical 
biomarkers for EWS disease aggressive- 
ness, as the authors suggest, their findings 
may support a new indication for clinical 
WGS. However, additional analysis of more 
patient samples will be needed to confirm 
that the presence of chromoplexy is an inde- 
pendent prognostic predictor in EWS. This 
is because Anderson et al. find that chromoplexy-driven
EWS is more likely to contain tumor
protein 53 (TP53) mutations. Because
TP53 and stromal antigen 2 (STAG2) mu- 
tations and genomic complexity have each 
been associated with more aggressive EWS 
(11-13), dissecting the contribution of these 
factors to poor clinical outcomes in chro- 
moplexy-derived EWS will be an important 
area of future work. More generally, the 
work of Anderson et al. has important clinical
implications for the genomic diagnosis
of these and other cancers, as well as the 
expanding biological role of complex rearrangements
in cancer evolution.

ORGANOMETALLICS


18 electrons 


and counting 


The bonding rule for 
transition metal complexes 
now extends to alkaline 
earth octacarbonyls 


By P. B. Armentrout 


The “octet rule” is based on the stability
afforded to species with closed-shell
electron configurations like the
noble gases. Simple “second-row” com- 
pounds like methane, ammonia, and 
water have eight electrons surround- 
ing the central atom, as do their third-row 
analogs (silane, phosphine, and hydrogen 
sulfide). For atoms in the fourth row of the 
periodic table and beyond (principal quantum
number designated by n), the single ns,
three np, and five (n − 1)d orbitals must be
filled with two electrons per orbital, result- 
ing in an analogous “18-electron” rule for a 
closed shell. This simple electron counting 
guides inorganic chemists working with tran- 
sition metals in predicting stable compounds, 
just as the octet rule guides organic chemists 
working with carbon. For example, the stable 
transition metal carbonyls Cr(CO)₆, Fe(CO)₅,
and Ni(CO)₄, as well as heavier homologs,
can be formed, indicating that CO is a two- 
electron donor. Metals with an odd number 
of valence electrons must double up with a 
metal-metal bond, so Mn₂(CO)₁₀ and Co₂(CO)₈
form in order to satisfy the 18-electron rule. 
On page 912 of this issue, Wu et al. (1) demonstrate
that the 18-electron guiding principle
is not limited to transition metals but
can also be extended to nearby elements, the
alkaline earths.
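
The electron counting described above can be written out as a short calculation. The sketch below is a minimal illustration, not a general-purpose tool: the function and table names are hypothetical, it handles only CO ligands (two electrons each), and it counts one electron per metal-metal bond for each metal center.

```python
# Valence electrons of the neutral metal atom: the group number for these
# transition metals, and 2 (ns2) for the alkaline earths.
VALENCE_ELECTRONS = {
    "Cr": 6, "Mn": 7, "Fe": 8, "Co": 9, "Ni": 10,
    "Ca": 2, "Sr": 2, "Ba": 2,
}


def electron_count(metal: str, n_co: int, charge: int = 0,
                   metal_metal_bonds: int = 0) -> int:
    """Electrons around one metal center in a carbonyl complex.

    Each CO donates 2 electrons, each metal-metal bond contributes 1
    electron from the partner metal, and a positive charge removes electrons.
    """
    return VALENCE_ELECTRONS[metal] + 2 * n_co + metal_metal_bonds - charge


# Complexes named in the text; the dimers are counted per metal center.
EXAMPLES = {
    "Cr(CO)6": electron_count("Cr", 6),
    "Fe(CO)5": electron_count("Fe", 5),
    "Ni(CO)4": electron_count("Ni", 4),
    "Mn2(CO)10 (per Mn)": electron_count("Mn", 5, metal_metal_bonds=1),
    "Co2(CO)8 (per Co)": electron_count("Co", 4, metal_metal_bonds=1),
    "Ca(CO)8": electron_count("Ca", 8),
}

for name, count in EXAMPLES.items():
    print(f"{name}: {count} electrons")
```

Every entry comes out at 18, including the alkaline earth octacarbonyl, where the metal supplies only 2 electrons and the eight CO ligands supply the other 16.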

Alkaline earth metals (Ms; Be, Mg, Ca, Sr, 
and Ba) have a valence electron configuration 
of ns² and generally form two covalent bonds
with other elements, for example, MgO and
BaBr₂. Grignard reagents RMgX, where R is
generally an alkyl group and X is a halide, are 
common reagents in organic and organome- 
tallic synthesis. In solution, alkaline earths 
are readily oxidized and normally found in 
their +2 oxidation state. Surprisingly, Wu et 
al. show that the octacarbonyls of the heavier 
alkaline earths—Ca, Sr, and Ba—can be 
formed in the neutral state as a consequence 
of surrounding the two-electron M atoms 
with another 16 electrons. 

Metal carbonyls have been known since the 
late 19th century (2) and are used as starting 
materials for transition metal compounds in 
organic synthesis and as catalysts in hydroformylation.
At a fundamental level, metal
carbonyls can be used to assess the relative
strength of binding interactions in coordination
complexes. The highest occupied molecular
orbital (HOMO) in CO corresponds to
a lone pair of electrons on the C atom (see
the figure, left), so CO binds to metals at
the C atom, except in extraordinary circumstances.
Furthermore, this orbital can donate
two electrons to the metal center. In the CO
triple bond (one σ and two π bonds), four of
the six electrons come from the more electronegative
O atom. The dipole moment of CO is
relatively small (0.122 debye), and counterintuitively,
the C atom has the negative charge.
However, as the C=O bond stretches, the
electrons follow the more electronegative O
atom, and the dipole moment of CO increases
dramatically. Thus, CO molecules “light up”
in infrared (IR) spectroscopy because they
have a large change in their dipole moment
upon stretching.

This property of the CO ligand can then
be used to assess how it binds to metals. The
electron-counting procedure noted above,
in which the CO ligand donates its pair
of electrons into an empty orbital on the
metal, provides one binding motif, called σ
donation (after the symmetry of the bond
being formed; see the figure, right). Because
the HOMO is largely a nonbonding orbital
on CO, σ-bond donation does not greatly affect
the C=O stretching frequency (3).

However, CO ligands are actually more
promiscuous in their binding. The lowest
unoccupied molecular orbitals (LUMOs)
on CO are the two antibonding π* orbitals
(see the figure, left). Transition metals
can utilize electrons in d orbitals with the
same π symmetry to augment the binding
to the CO ligand, in essence, forming a second
or even a third bond between the metal
and the carbon (see the figure, right). This
type of electron-exchange bonding is often
referred to as the Dewar-Chatt-Duncanson
model (4, 5), although these authors actually
addressed similar σ and π interactions
between metals and olefins. The strength
of these “backbonding” interactions can be
assessed by measuring the C=O stretching
frequency. As electrons are donated into
the antibonding LUMOs, the C=O bond
becomes weaker and its vibrational frequency
is lowered. For example, in Ni(CO)₄,
the stretch shifts down to ~2060 cm⁻¹ from
the free CO at 2143 cm⁻¹. The shift increases
for isoelectronic anionic analogs that backbond
more effectively, as in Co(CO)₄⁻ at
~1890 cm⁻¹ and Fe(CO)₄²⁻ at ~1790 cm⁻¹ (6).

Building an unexpected bond

Alkaline earth carbonyls reported by Wu et al. can be understood from simple bonding concepts.

Bonding in carbon monoxide (energy level diagram)

The molecular orbital diagram for the formation of carbon
monoxide from carbon and oxygen atoms is shown.
Several of the molecular orbitals are shown to the right.

Creating alkaline earth carbonyls (bonding orbital)

The bonding interactions between an alkaline
earth metal (M) and carbon monoxide are shown.
Vertical arrows indicate electrons in both parts.

Wu et al. generated complexes of Ca, Sr, 
and Ba with CO in a cold (4 K) neon ma- 
trix that allowed weakly bound species to 
form (7). They interrogated the matrix using
IR irradiation, finding single intense
absorptions at 1987, 1995, and 2014 cm⁻¹
for saturated carbonyl complexes of Ca, Sr,
and Ba, respectively, as well as other absorptions
for smaller complexes. Because
only a single C=O stretching frequency
was observed, the absorbing species must
have high symmetry, assigned as cubic O_h
for M(CO)₈.

These frequencies indicate that substantial 
backbonding occurs that gradually decreases 
with increasing metal size. Backbonding is 
critical in the formation of these complexes. 
The approach of CO to one of the Ms should 
entail a repulsive interaction between the occupied
ns² orbital on the M and the HOMO
of CO. To avoid this difficulty, the Ms must
empty seven valence orbitals [one ns, three
np, and three of the (n − 1)d orbitals], which
permits strong σ donor bonds with the CO
ligands. The eighth “bonding” orbital needed
for eight ligands is purely ligand based and
has a_2u symmetry (a type for which there is
no atomic orbital on the M). The two M valence
electrons then occupy the remaining
two (n − 1)d orbitals (having e_g symmetry),
which augment the bonding by backbonding 
interactions, as demonstrated by the C=O 
stretching frequencies measured. 
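
The trend in backbonding can be made concrete by computing how far each stretch lies below that of free CO. The sketch below uses only the frequencies quoted in the text (several of which are approximate); the names are hypothetical, and treating the red shift as a measure of backbonding is a qualitative simplification, especially when comparing complexes with different charges and coordination numbers.

```python
# C-O stretching frequency of free CO in cm^-1, as quoted in the text.
FREE_CO_CM1 = 2143

# Observed stretches in cm^-1 as quoted in the text; the transition metal
# values are approximate ("~") in the source.
OBSERVED_CM1 = {
    "Ni(CO)4": 2060,
    "Co(CO)4-": 1890,
    "Fe(CO)4 2-": 1790,
    "Ca(CO)8": 1987,
    "Sr(CO)8": 1995,
    "Ba(CO)8": 2014,
}


def red_shift(frequency_cm1: float) -> float:
    """Shift below free CO; a larger value suggests more backbonding."""
    return FREE_CO_CM1 - frequency_cm1


SHIFTS = {name: red_shift(freq) for name, freq in OBSERVED_CM1.items()}

for name, shift in sorted(SHIFTS.items(), key=lambda item: item[1]):
    print(f"{name}: red shift of {shift} cm^-1")
```

On these numbers the alkaline earth octacarbonyls are shifted by 156, 148, and 129 cm⁻¹ for Ca, Sr, and Ba, which matches the statement that backbonding decreases gradually with increasing metal size.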

Intriguingly, because these two orbitals 
are isoenergetic, the M(CO)₈ species do not
actually have a singlet spin state in which
all the electrons are paired (which one ordinarily
associates with stable 18-electron
complexes). Rather, each of the two (n − 1)d
orbitals contains a single electron, yielding
a triplet spin state, as confirmed by quantum
chemical calculations. Wu et al. also
examined the cationic analogs of M(CO)₈
complexes in the gas phase. Evidence for a
saturated M(CO)₈⁺ complex (now a 17-electron
species) was obtained by observation
of a single C=O stretch, whereas M(CO)₉⁺
exhibited a band characteristic of a CO li- 
gand in a weakly bound second ligand shell. 

The study of Wu et al. challenges previous
notions of limitations in the 18-electron
rule and provides complexes exhibiting very 
interesting bonding motifs in the process. 
Whether the distinctive properties of such 
compounds can be exploited remains to be 
seen, but the study foreshadows additional 
complexes that might be generated and test 
the limits of the 18-electron rule.

Global carbon intensity of
crude oil production 


New data enable targeted policy to lessen GHG emissions 


By Mohammad S. Masnadi, Hassan M. El-Houjeiri, Dominik Schunack, Yunpo Li, Jacob 
G. Englander, Alhassan Badahdah, Jean-Christophe Monfort, James E. Anderson, 
Timothy J. Wallington, Joule A. Bergerson, Deborah Gordon, Jonathan Koomey, Steven 
Przesmitzki, Inés L. Azevedo, Xiaotao T. Bi, James E. Duffy, Garvin A. Heath, Gregory 
A. Keoleian, Christophe McGlade, D. Nathan Meehan, Sonia Yeh, Fengqi You, Michael 


Wang, Adam R. Brandt 


Producing, transporting, and refining
crude oil into fuels such as gasoline
and diesel accounts for ~15 to 40% of
the “well-to-wheels” life-cycle greenhouse
gas (GHG) emissions of transport
fuels (1). Reducing emissions
from petroleum production is of particular 
importance, as current transport fleets are 
almost entirely dependent on liquid petro- 
leum products, and many uses of petroleum 
have limited prospects for near-term substi- 
tution (e.g., air travel). Better understand- 
ing of crude oil GHG emissions can help to 
quantify the benefits of alternative fuels and 
identify the most cost-effective opportunities 
for oil-sector emissions reductions (2). Yet, 
while regulations are beginning to address 
petroleum sector GHG emissions (3-5), and 
private investors are beginning to consider 
climate-related risk in oil investments (6), 
such efforts have generally struggled with 
methodological and data challenges. First, 
no single method exists for measuring the 
carbon intensity (CI) of oils. Second, there is 
a lack of comprehensive geographically rich 
datasets that would allow evaluation and 
monitoring of life-cycle emissions from oils. 
We have previously worked to address the 
first challenge by developing open-source 
oil-sector CI modeling tools [OPGEE (7, 8), 
supplementary materials (SM) 1.1]. Here, we 
address the second challenge by using these 
tools to model well-to-refinery CI of all major 
active oil fields globally—and to identify ma- 
jor drivers of these emissions. 

We estimate emissions in 2015 from 8966 
on-stream oil fields in 90 countries (SM 
1.4.4). These oil fields represent ~98% of
2015 global crude oil and condensate pro- 
duction. This analysis includes all major 
resource classes (e.g., onshore/offshore and
conventional/unconventional) and accounts
for GHG emissions from exploration, drill- 
ing and development, production and ex- 
traction, surface processing, and transport 
to the refinery inlet (collectively called “up- 
stream” hereafter). These results are based 
on data from nearly 800 references, includ- 
ing government sources, scientific literature, 
and public technical reports (SM 1.4.1, 1.4.4, 
and table S17). Proprietary databases are 
used to supplement these data when infor- 
mation is unavailable in the public domain 
(generally for small oil fields). The latest In- 
tergovernmental Panel on Climate Change 
(IPCC) 100-year global warming potential 
(AR5/GWP100) factors are used in this work 
(SM 1.2.1). 


COUNTRY-LEVEL UPSTREAM 

CARBON INTENSITY 

The first figure presents the first upstream 
country-level volume-weighted-average CI 
estimates and their corresponding error bars 
(see fig. S22 for the global upstream CI map). 
Error bars are computed by using probabilis- 
tic uncertainty analysis solely associated with 
missing input data (SM 1.7 and 2.4). The CI
estimates of some countries with poor data
quality (e.g., Russia) are more uncertain (SM
1.4.6 and 2.3).

The global volume-weighted-average upstream
CI estimate (shown by the vertical
red line in the first figure) is 10.3 g CO₂ equivalents
(CO₂eq.)/megajoule (MJ) crude oil
(+6.7 and -1.7), with country-level intensities
ranging from 3.3 (Denmark) to 20.3 (Algeria)
g CO₂eq./MJ. Carbon dioxide and methane
contribute on average 65% and 34% of total
CO₂eq. emissions, respectively (SM 2.2). The
total petroleum well-to-refinery GHG emissions
in 2015 are estimated to be ~1.7 Gt
CO₂eq., ~5% of total 2015 global fuel combustion
GHG emissions. This estimate of total
emissions is ~42% higher than an industry-wide
scaling of an estimate for 2015 from the
International Association of Oil and Gas Pro- 
ducers (based on datasets comprising 28% of 
global production with uneven geographical 
coverage). See SM 3 for exploration of the dif- 
ferences between our analyses. 
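
A volume-weighted average of this kind can be sketched as follows. The field data here are invented purely for illustration and the function name is hypothetical; the study's actual weighting and inputs are described in its supplementary materials.

```python
from typing import Iterable, Tuple


def volume_weighted_ci(fields: Iterable[Tuple[float, float]]) -> float:
    """Volume-weighted-average carbon intensity.

    Each field is a (production_volume, ci) pair, with ci in g CO2eq./MJ.
    The volume unit cancels out, so any consistent unit works.
    """
    fields = list(fields)
    total_volume = sum(volume for volume, _ in fields)
    if total_volume <= 0:
        raise ValueError("total production volume must be positive")
    return sum(volume * ci for volume, ci in fields) / total_volume


# Hypothetical fields: (production volume, CI in g CO2eq./MJ).
example_fields = [
    (100.0, 3.3),   # small, low-intensity field
    (50.0, 20.3),   # small, high-intensity field
    (250.0, 9.0),   # large field that dominates the average
]

average_ci = volume_weighted_ci(example_fields)
print(f"Volume-weighted CI: {average_ci:.2f} g CO2eq./MJ")
```

Because the average is weighted by production, the large field pulls the result toward its own intensity; a simple unweighted mean of the same three values would be noticeably higher.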

Emissions shown in the first figure can 
vary substantially over time (9), but time- 
series data are generally missing on a global 
basis and so are not explored here. In general, 
oil production declines with oil field deple- 
tion but is also accompanied by a substantial 
increase in per-MJ GHG emissions due to use 
of enhanced recovery practices. Other factors 
(e.g., oil price, geopolitics) could also affect oil 
production and thus the temporal CI (9). 

Gas flaring (burning) practices have a 
considerable influence on the CI. If not eco- 
nomically salable, this gas is either flared, 
reinjected, or vented (directly emitting meth- 
ane). The estimated share of flaring emis- 
sions in the global volume-weighted-average 
upstream CI is 22% (i.e., 2.3 g CO₂eq./MJ).
Flaring data are not widely reported by gov- 
ernments or companies, so for most regions, 
our analysis relies on satellite-estimated vol- 
umes computed using nighttime radiometry 
(SM 1.2.4 and 1.4.3.18). Some important con- 
ventional crude oil producers with above-av- 
erage global CI, such as Algeria, Iraq, Nigeria, 
Iran, and the United States, are also among 
the top 10 countries in flaring observed via 
satellite. The contributions of routine flar- 
ing to the total volume-weighted-average CI 
of these countries are estimated herein to be 
~41, 40, 36, 21, and 18%, respectively. Vari- 
ability between flaring data sources results 
in greater uncertainty for countries with high 
contribution of flaring to their CI. Figure 
S27 shows that gas venting instead of flaring
increases the estimated GHG emissions
substantially (SM 1.2.4 and 2.6). However, 
currently there is no reliable remote-sensing 
technology for measuring gas venting. 
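
The flaring figures quoted above follow from multiplying a carbon intensity by the flaring share. The sketch below checks the global number and applies the same arithmetic to Algeria, the one country for which the text gives both a country-level CI and a flaring share; the function name is hypothetical, and the inputs are the rounded values from the text.

```python
def flaring_contribution(ci_total: float, flaring_share: float) -> float:
    """Part of a carbon intensity (g CO2eq./MJ) attributable to flaring."""
    if not 0.0 <= flaring_share <= 1.0:
        raise ValueError("flaring_share must be between 0 and 1")
    return ci_total * flaring_share


# Global: 22% of the 10.3 g CO2eq./MJ volume-weighted average.
global_flaring = flaring_contribution(10.3, 0.22)

# Algeria: ~41% of its 20.3 g CO2eq./MJ country-level intensity.
algeria_flaring = flaring_contribution(20.3, 0.41)

print(f"Global flaring CI: {global_flaring:.1f} g CO2eq./MJ")
print(f"Algeria flaring CI: {algeria_flaring:.1f} g CO2eq./MJ")
```

The global result rounds to the 2.3 g CO₂eq./MJ stated in the text. The Algerian figure, roughly 8.3 g CO₂eq./MJ, is derived here from the rounded inputs and is not a value reported in the source.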

As the major global producers of uncon- 
ventional heavy oils, Venezuela and Canada 
have high country-level CI. This is due to 
energy- and CO₂-intensive heavy oil extraction
and upgrading. Enhanced oil recovery
by steam flooding contributes to high CI in 
other locations, such as Indonesia, Oman, 
and California (USA). 

Although some giant North Sea offshore 
fields have shown rapidly increasing per-bbl 
(barrel) emissions due to depletion (9), they 
have low upstream GHG intensities when 
compared to many other global oil fields. This 
is in part due to stringent regulations on gas 
processing and handling systems and renew- 
able electric-power-from-shore initiatives. 
Saudi Arabia is the largest global oil producer 
but has a small number of extremely large 
and productive reservoirs. The country has 
low per-barrel gas flaring rates and low water
production—resulting in less mass lifted
per unit of oil produced and less energy used 
for fluid separation, handling, treatment, and 
reinjection—and thus contributing to low CI. 


FIELD-LEVEL UPSTREAM CARBON INTENSITY

The second figure shows a global field-level 
CI curve for our 8966 fields (sorted cumu- 
latively) and illustrates CI heterogeneity of 
global crudes (SM fig. S19; Results Data Excel 
file). Fields in the top 5% of the CI distribution
emit more than twice as much as the median field.
Upstream mitigation measures should focus 
on fields in the upper end of the CI curve. 
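
The volume-weighted average and the cumulative CI curve described above can be sketched in a few lines of Python. This is a minimal illustration, not the study's model: the five fields and their values are hypothetical, the names `Field`, `volume_weighted_ci`, `supply_curve`, and `ci_at_volume_percentile` are invented for the example, and weighting by barrels assumes every barrel carries the same energy content.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    name: str
    production_bpd: float  # oil production, barrels per day (hypothetical)
    ci_g_per_mj: float     # upstream CI, g CO2eq./MJ (hypothetical)


# Hypothetical fields, for illustration only; not data from the study.
FIELDS = [
    Field("A", 400_000, 5.0),
    Field("B", 250_000, 8.0),
    Field("C", 200_000, 10.0),
    Field("D", 100_000, 14.0),
    Field("E", 50_000, 25.0),
]


def volume_weighted_ci(fields):
    """Average CI with each field weighted by its production volume."""
    total = sum(f.production_bpd for f in fields)
    return sum(f.production_bpd * f.ci_g_per_mj for f in fields) / total


def supply_curve(fields):
    """(cumulative production, CI) pairs, sorted by increasing CI."""
    curve, cumulative = [], 0.0
    for f in sorted(fields, key=lambda f: f.ci_g_per_mj):
        cumulative += f.production_bpd
        curve.append((cumulative, f.ci_g_per_mj))
    return curve


def ci_at_volume_percentile(fields, pct):
    """CI of the field at which cumulative production first reaches pct% of the total."""
    total = sum(f.production_bpd for f in fields)
    threshold = total * pct / 100.0
    for cumulative, ci in supply_curve(fields):
        if cumulative >= threshold:
            return ci
    raise ValueError("pct must be between 0 and 100")


print(f"volume-weighted CI: {volume_weighted_ci(FIELDS):.2f} g CO2eq./MJ")
print(f"median CI by volume: {ci_at_volume_percentile(FIELDS, 50)} g CO2eq./MJ")
print(f"95th percentile CI by volume: {ci_at_volume_percentile(FIELDS, 95)} g CO2eq./MJ")
```

Sorting by CI before accumulating production is what turns a list of fields into a supply curve: the few high-CI fields sit at the right-hand end, which is where the text suggests mitigation should focus.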

Although crude density (requiring thermal 
extraction methods) and flaring are key de- 
terminants of a high CI (SM 1.5), the second 
figure shows that flaring is the more preva- 
lent driver: For the highest CI quartile (i.e., 
>11.2 g CO₂eq./MJ) in this figure, 51% of crude
volume comes from high flare fields (yellow, 
red), while 18% comes from heavy oil fields 
(yellow, blue). Only 4 and 9% of crude vol- 
umes from the rest of the sample (i.e., <11.2 g 
CO₂eq./MJ) come from high flare and heavy
oil fields, respectively. 

The cumulative CI curve uncertainty due 
to missing input data is computed via a 
Monte Carlo simulation and presented in fig. 
S25 (SM 1.7 and 2.4).


POLICY IMPLICATIONS 
Although oil alternatives like electric vehicles 
are rapidly growing, society is likely to use 
large volumes of oil in the coming decades 
(10); thus, mitigation of crude oil CI is key. 
Our tools and dataset allow for improved 
analysis of the benefits of emissions mitiga- 
tion policies. We highlight three broad strat- 
egies to reduce GHG impacts: (i) resource 
management, (ii) resource prioritization, and 
(iii) innovative technologies. 

Performance-oriented fuel quality standard 
programs based on life-cycle analysis models 
have been implemented successfully and have 
created new regional market drivers (e.g., in
California, British Columbia, the European 
Union). Relying on market forces and credit/ 
debit mechanisms, these fuel-agnostic policies 
do not dictate specific technologies to reduce 
the emissions but rather encourage innova- 
tion to comply with the quality mandates. 
To achieve greater impacts, such regional 
fuel standard policies are emerging nation- 
ally (e.g., Canada’s Clean Fuel Standard) and, 
subsequently, worldwide. These regulations 
should recognize the climate impact hetero- 
geneity of different crude oils (see the second 
figure) to reward improved production prac- 
tices with clear per-barrel incentives for the 
lowest CI producers (10).

National volume-weighted-average crude oil upstream GHG intensities (2015)
The global volume-weighted CI estimate is shown (red line, ~10.3 g CO₂eq./MJ). Error bars
reflect 5th to 95th percentiles of Monte Carlo
simulation to explore the uncertainty associated with
missing input data (see SM 1.7 and 2.4).

The current lack of transparency about
global oil operations makes this type of
analysis particularly challenging. Labor-intensive
data gathering (as undertaken here)
still results in large uncertainty in emissions 
estimates (SM 2.3 and 2.4). Thus, it is im- 
portant to adopt policies to make data from 
oil and gas operations publicly available. If 
done correctly, these data can be released 
without affecting competitiveness of enter- 
prises. Countries including Norway, Canada, 
the United Kingdom, Denmark, and Nigeria 
have led in this respect. As countries pledge 
their commitments to reduce country-level 
GHG emissions and transparent reporting 
under the Paris Agreement, it is essential for 
energy-intensive industries (such as the oil 
and gas sector) to regularly report their an- 
nual carbon footprints. New industry efforts 
such as the Oil and Gas Climate Initiative are 
beginning to tackle this challenge. 

CI curves for four hypothetical GHG miti- 
gation case studies are shown in fig. S26 (SM 
1.2.2 and 2.5). Two “no routine flaring” case 
studies restrict the flare-to-oil ratio (FOR) to 
be no higher than the global 5th and 25th 
percentiles. A fugitive emissions reduction 
scenario sets fugitive and venting emissions 
to be 0.2 g CO₂eq./MJ, approximately the
volume-weighted average from Norwegian 
oil fields in 2015 (SM 1.2.2). Cases with no 
routine flaring (moderate and extreme) have 
global volume-weighted-average CI reduced 
from 10.3 (current world) to 8.7 and 8.3 g 
CO₂eq./MJ. Achieving the fugitive and venting
reduction scenario results in 7.9 g CO₂eq./MJ.
These case studies mitigate 15% [262 megatons
(Mt) CO₂eq.], 19% (332 Mt CO₂eq.), and
23% (397 Mt CO₂eq.) of the current annual
global upstream estimate, respectively. A
fourth case study, including both stringent
flaring reduction and minimal fugitive and
venting emissions, reduces the global average
to 5.8 g CO₂eq./MJ and results in ~43% (~743
Mt CO₂eq.) annual CI reduction.
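
The percentages quoted for the four case studies follow from the CI values given in the text. A minimal Python sketch of that arithmetic, assuming only the rounded CI values quoted above (the names `BASELINE_CI`, `CASE_CI`, and `percent_reduction` are illustrative and not from the study):

```python
# Global volume-weighted-average upstream CI values quoted in the text (g CO2eq./MJ).
BASELINE_CI = 10.3
CASE_CI = {
    "no routine flaring (moderate)": 8.7,
    "no routine flaring (extreme)": 8.3,
    "fugitive and venting reduction": 7.9,
    "combined flaring and fugitive/venting": 5.8,
}


def percent_reduction(baseline: float, case: float) -> float:
    """Relative reduction of a case CI versus the baseline, in percent."""
    return 100.0 * (baseline - case) / baseline


reductions = {
    name: percent_reduction(BASELINE_CI, ci) for name, ci in CASE_CI.items()
}

for name, pct in reductions.items():
    print(f"{name}: {pct:.1f}% below the current global average")
```

Because the inputs are rounded to one decimal place, the results agree with the quoted 15, 19, 23, and ~43% only to within about one percentage point.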

Global field-level upstream carbon intensity supply curve (2015)
Contribution of high flaring (“Flare” with FOR >75th percentile of all fields) and oil density (“Heavy” with API gravity <22°) are shown. Bar width reflects the oil production
of a particular field in 2015. Global GHG intensity percentiles (5%, 25%, 50%, 75%, 95%) are 4.7, 7.3, 9.1, 11.2, and 19.5 g CO₂eq./MJ crude oil, respectively.
Axis label: Cumulative oil production (million barrels per day)

A simple calculation suggests that upstream
emissions from oil extraction can
materially affect cumulative emissions caps.
Assume a reduction in the current global
volume-weighted-average CI to the current
25th percentile (reducing emissions by ~3 g
CO₂eq./MJ). Such reductions would be possible
using the mitigation case studies from
fig. S26. Given that a typical barrel of crude
oil yields ~6000 MJ, this would result in ~18
kg CO₂eq./bbl emissions reduction. Also note
that IPCC scenarios—even with aggressive
adoption of alternative fuels used for transport—still
result in projected cumulative oil
consumption of >1 trillion barrels in the 21st
century. Thus, at least 18 metric gigatons
(Gt) CO₂eq. (~12 Gt as CO₂ and ~6 Gt as CH₄)
could be saved over the century by mitigating
oil-sector emissions through wise resource
choices and improved gas management practices.
Considering additional mitigation opportunities
across the crude oil supply chain
(e.g., improved refining), 18 Gt is likely an
underestimate; other studies have estimated up
to 50 Gt CO₂eq. reduction potential (10). For
a >66% chance to keep global average temperature
increases below 2°C, a total of ~800
Gt CO₂ can be emitted from 2017 forward (12).
The petroleum sector reduction potentials
outlined above are material on this scale.
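
The back-of-envelope calculation in this paragraph can be written out step by step. The sketch below uses only the round numbers quoted in the text; the variable names are illustrative, and the final ratio sets a CO2eq. saving against a CO2-only budget, so it is an indication of scale rather than a strict share.

```python
# Inputs quoted in the text (all approximate).
CI_REDUCTION_G_PER_MJ = 3.0     # ~3 g CO2eq./MJ reduction in average CI
ENERGY_PER_BARREL_MJ = 6000.0   # ~6000 MJ per barrel of crude oil
CUMULATIVE_BARRELS = 1.0e12     # >1 trillion barrels this century (lower bound)
CARBON_BUDGET_GT = 800.0        # ~800 Gt CO2 from 2017 forward

GRAMS_PER_KG = 1.0e3
KG_PER_GT = 1.0e12              # 1 Gt = 1e9 metric tons = 1e12 kg

# Step 1: per-barrel saving in kg CO2eq.
saving_kg_per_bbl = CI_REDUCTION_G_PER_MJ * ENERGY_PER_BARREL_MJ / GRAMS_PER_KG

# Step 2: cumulative saving over the century in Gt CO2eq.
total_saving_gt = saving_kg_per_bbl * CUMULATIVE_BARRELS / KG_PER_GT

# Step 3: scale relative to the remaining budget (indicative only).
budget_share_pct = 100.0 * total_saving_gt / CARBON_BUDGET_GT

print(f"saving per barrel: {saving_kg_per_bbl:.0f} kg CO2eq./bbl")
print(f"cumulative saving: {total_saving_gt:.0f} Gt CO2eq.")
print(f"relative to the ~800 Gt budget: {budget_share_pct:.2f}%")
```

Since the barrel count is a lower bound, the cumulative figure is a lower bound as well, which is consistent with the text's "at least 18" Gt.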

Extraction and processing of heavy oils
and oil sands with current technologies are
very energy- and carbon-intensive, and
reducing these intensities is challenging.
Although market forces have recently
led to investment shifts based on economics 
alone (12), other mechanisms exist to reduce 
emissions. Solar-powered steam generators 
developed for heavy oil fields in Oman and 
California can provide substantial mitigation 
benefit. More broadly, use of solar energy 
could result in sectorwide emissions reductions
on the order of 5 kg CO₂eq./bbl (~1.7 g
CO₂eq./MJ) (13). For some key regions with
high seasonality and poor economics of solar 
technology (like Canada), using energy in- 
puts with low carbon intensity (e.g., hydrogen 
sourced from wind and biomass), capturing 
CO₂ from oil sands extraction and upgrading
facilities, and investing in new low-carbon
technologies (e.g., nanoparticle-assisted in-situ
recovery, or CO₂-free production of H₂
from CH₄ via catalytic molten metals) would
be beneficial. In addition, low-value but high- 
carbon products such as petroleum coke from 
upgraded oil sands could be sequestrated in 
lieu of combustion (10). Countries with diverse
resources could reduce their national
CI by prioritizing less carbon-intensive assets 
(e.g., tight oil), accompanied by stringent flar- 
ing and venting management. 

Flaring rates can also be reduced. The 
Global Gas Flaring Reduction Partnership 
(GGFR) reported a nearly continuous in- 
crease in global flared gas from 2010 to 2016. 
Flaring is a management and infrastructure 
problem and is not an unavoidable outcome 
of crude oil production. Plans for new oil 
field development should incorporate conservation
methods (i.e., capture, utilization,
and/or reinjection) to eliminate routine flar- 
ing. Canadian regulations point to a method 
for enforcement: For offshore fields where 
flaring is excessive, production rate restric- 
tions are imposed until flaring reductions 
are made (14). Initiatives like the World Bank
GGFR Zero Routine Flaring by 2030 are a 
start, though these could be strengthened 
with international advisory, financial, and 
technical aid to help countries implement 
flaring reduction policies. Moreover, continu- 
ous monitoring and verification are essential 
not only for flare management but also for 
eliminating venting and fugitive methane 
emissions in the oil and gas sector. Modern 
surveillance using remote-sensing technologies
(e.g., flare- and methane-sensing satellites)
could be supported and expanded (10).

Methane fugitive emissions and venting 
from oil and gas facilities are poorly detected, 
measured, and monitored, and thus can increase
the uncertainty associated with CI
estimates. Recently, the International Energy 
Agency (IEA) estimated 76 Mt methane emis- 
sions from global oil and gas operations in 
2015, with ~34 Mt due to oil production (5). 
This prorates to ~4.6 g CO₂eq./MJ crude oil,
higher than this study’s estimate of methane
contribution (~2.6 g CO₂eq./MJ averaged
from all global fields, from all fugitive emis- 
sions and venting). In many cases, reducing 
methane emissions can result in additional 
revenues from the captured methane. IEA 
estimates that around 40 to 50% of current 
methane emissions could be avoided at no 
net cost. The cost of mitigation is generally 
lowest in developing countries in Asia, Af- 
rica, and the Middle East, but in all regions, 
reducing methane emissions remains a cost- 
efficient way of reducing GHG emissions (15). 

Important questions remain with regard to 
the interactions of economics and emissions. 
The CI curve in the second figure reflects 
differences in CI, but crude oil production 
choices are obviously influenced by the inter- 
action of local production costs and the global 
price of oil. A market structure without carbon
prices neglects differences in CI shown
in the second figure. Future work needs to 
examine the interaction of supply economics 
and CI for different resource classes. 

Data-driven CI estimates such as this work 
can encourage prioritizing low-CI crude oil 
sourcing, point to methods to manage crude 
oil CI, and enable governments and investors 
to avoid “locking in” development of high-CI 
oil resources. However, future progress in 
this direction will rely fundamentally on im- 
proved reporting and increased transparency 
about oil-sector emissions. 

CLIMATE POLICY


A president amplifies 
unlikely activists 


In dispatches from the front lines of global warming, 
Mary Robinson pushes for humane climate policies 


By Louise Fabiani 


Climate Justice
Mary Robinson
Bloomsbury, 2018. 176 pp.

The reviewer is a freelance science writer and
culture critic based in Montreal, Québec, Canada.
Email: m.l.fabiani47@gmail.com

One of the cruelest ironies of today’s
world is that many of the nations
least responsible for causing climate
change find themselves most affected.
Widely fluctuating weather patterns
and rising seas caused by
melting glaciers are already displacing
millions of people, endangering
food security, increasing
epidemics, and destabilizing political
systems in these areas, most
of which tend to hug the tropics.
But even the Arctic is not safe:
Temperatures have been rising in
circumpolar regions more rapidly
than anywhere, reducing sea ice
and turning the polar bear into a
poster child for the perils of global
warming. Meanwhile, climate summits continue,
scientists publish studies, and our collective
addiction to fossil fuels goes on.

As the former president of Ireland, Mary
Robinson knows about the doublespeak of
global diplomacy all too well. Her new book,
Climate Justice, is less about grim statistics
and stalemates than how we might really address
the problems facing our planet.

In her capacity as United Nations special
envoy on climate change, a position she held
from 2014 to 2015, Robinson attended countless
meetings with world leaders. It is another
kind of person she encountered, however—
including, notably, many women
who had never dreamed of becoming
activists—who enlivens this
book. Through the testimonies of
ordinary citizens, we learn about
the horrendous toll drought is
taking in Chad, the way warmer
winters threaten the continued
existence of Saami reindeer culture
in northern Europe, and the
high cultural value the Vietnamese
place on local forest products.

Island nations such as Kiribati
may have to move their entire populations
before rising sea levels cover the land masses
they call home. “[B]ecause of its position on
the international date line, Kiribati was the
first country in the world to welcome in the
new millennium,” Robinson writes. “Now, in
a tragic twist of fate, it may become the first
one lost to the effects of climate change before
the dawn of the next century.” Yet in the
2009 Copenhagen climate talks, delegates ignored
Kiribati’s pleas to substantially reduce
CO₂ levels. Mitigation, including a proposal to
design floating islands, seems to appeal more
to decision-makers than does prevention.

Mary Robinson meets with local farmers in Tigray,
Ethiopia, during a period of severe drought in 2016.

In this, as in nearly every other case of 
crisis management, the cost of “fixing” the 
problems that will arise as a result of global 
warming dwarfs that of almost every imagin- 
able measure we could take to stop it from 
occurring in the first place. But for whatever 
reason—Robinson does not speculate as to 
why we prefer to pay for a pound of cure over 
an ounce of prevention—our climate strate- 
gies are stuck in a wait-and-see mindset. 

Approaching climate change as an isolated 
issue would be wrong, no matter how much 
foresight we use. Robinson reminds the 
reader that we must also tackle the “poverty, 
inequality, and exclusion” that lock much 
of the world into patterns of destructive be- 
havior. This, along with the recalcitrance 
of businesses and politicians to break out 
of age-old modes of thinking, is one reason 
Robinson looks for inspiration in more tradi- 
tional forms of knowledge and the reason she 
advocates for increasing the participation of 
women and elders in indigenous communi- 
ties in climate discussions. These individuals 
tend to be the ones with hands-on experience 
of natural systems, she argues. 

Achieving “climate justice” also entails 
acknowledging the inevitable human cost 
of shifting to more ecofriendly systems. The 
‘Just Transition’ movement, for example, 
seeks to compensate workers who face re- 
duced or lost employment as renewable en- 
ergy sources replace coal, oil, and natural gas. 

To Robinson, the 2015 Paris Agreement felt 
like a step in the right direction. However, in 
2017 her worst fears were realized: President 
Donald Trump pulled the United States— 
the world’s second-biggest polluter—out of 
the contract. “It is unconscionable that the 
United States has simply walked away from 
its responsibility to people both at home and 
abroad, in the interest of short-term fossil 
fuel profits, and abandoned an agreement 
that was negotiated by more than 190 world 
leaders, over decades,” she writes. 

In the end, however, Robinson is optimis- 
tic about our ability to meet the challenges 
that lie ahead, if pragmatic about what will 
compel us to do so. “Enlightened self-interest,”
she assures readers—in the form of an
Alaskan compelled to call her congressper- 
son over fears her house will fall into the sea, 
or an insurance company pressing for new 
tailpipe-emission regulations because of sky- 
rocketing claims—can turn the drive for per- 
sonal survival into help for many others. 


10.1126/science.aau3320 

QUANTUM PHYSICS


Understanding the double slit 


Glimpses of quantum truth appear in diverse 
interpretations of the famous physics experiment 


By Mélanie Frappier 


Through Two Doors at Once
Anil Ananthaswamy

The reviewer is at the History of Science and Technology
Program, University of King's College, Halifax, NS B3H 2A1,
Canada. Email: melanie.frappier@ukings.ca

In his famous Lectures on Physics, Richard
Feynman argued that nothing more
is needed to get a solid grasp of the behavior
of quantum objects than the
simple double-slit experiment, in which
electrons or photons are fired toward two
thin openings cut in a screen. To Feynman,
the double-slit experiment encapsulated
quantum physics’s one and only mystery. Its
results could be described but, he
cautioned, could not be explained.
Despite Feynman’s warning, the
past 60 years have seen an explosion
of interpretations appealing
to devices as diverse as pilot waves
and parallel universes in the hope
of elucidating quantum behavior.
So far, no interpretation has
proven fully convincing, leading
many physicists to conclude that
the theory’s mathematical formalism
should be left uninterpreted
and to demand a return to the “shut up and
calculate” attitude that was prevalent among
the physicists of Feynman’s generation.

Veteran science journalist Anil Ananthaswamy
rejects this fatalistic perspective.
On the contrary, he argues, a deeper understanding
of the quantum world can only be
achieved by embracing the diversity of interpretations
available to us, a claim he persuasively
defends in Through Two Doors at Once.
In the book, he takes Feynman to task by offering
a spirited introduction to the various
quantum interpretations, examining their
respective explanations of the supposedly
inscrutable experiment.

Ananthaswamy starts his investigation
with a description of the double-slit experiment
that is so natural and elegant that one
may forget that it took physicists close to 30
years to develop the mathematical framework
needed to describe it adequately. He rapidly
recaps this struggle, from Planck’s original
suggestion that energy might sometimes be
quantized to the Bohr-Einstein debates of
the 1930s. Here, Through Two Doors at Once
offers little more than the usual narrative,
apart from an amusing detour through G. I.
Taylor’s nonchalant approach to an early
one-slit version of the experiment.

It is surprising to learn that the double-slit
experiment played a minor role in the
early development of quantum theory—
that is, until Ananthaswamy explains that
it was not performed in the laboratory
until the 1960s. Until then, it was only a
thought experiment.

Feynman might have obtained ideal data
by imagining himself firing at the two fictive
slits with a futuristic tungsten
electron gun, but in real life, physicists
wrestled for decades—relying
on everything from spider silk to
ingenious beamsplitters—to bring
the experiment to life. But this is
where the many iterations of the
double-slit experiment really take
center stage in the development of
our understanding of quantum reality.
Technological advances, we
learn, prompted physicists to conduct
ever-more-sophisticated versions
of the experiment, which in turn fueled
a greater variety of interpretations. This
increased the need for ever-more-sophisticated
experiments.

Step by step, Through Two Doors at Once 
reveals how physicists transformed, re- 
tooled, and repurposed the original double- 
slit setup to throw light on the fundamental 
principles of quantum physics. Each iteration
of the experiment is, for Ananthaswamy,
an opportunity to introduce readers
to fundamental concepts, such as entanglement;
to analyze iconic experiments, such as
Aspect’s test of Bell’s inequalities; or, more 
importantly, to examine the most promi- 
nent interpretations of quantum physics, 
from the Copenhagen interpretation of the 
1920s to the more recent “many interacting 
worlds” hypothesis. 

Ananthaswamy’s introduction of increas- 
ingly complex versions of the slit experi- 
ment proves extremely effective. Halfway 
through the book, even neophytes will likely 
find predicting the outcome of the delayed- 
choice quantum eraser experiment barely 
harder than figuring out the motions of a 
gear train. This approach also brings to the 
forefront the strengths and weaknesses of 
various interpretations, offering a perfectly 
balanced overview of each. 

But Ananthaswamy carefully guards him- 
self from offering any guiding principle that 
might help us decide which explanation is 
the best one. There is, he explains, no such 
thing as the “right” interpretation in good sci- 
ence. This does not mean, however, that we 
have to be mere instrumentalists and reject 
interpretations as misguiding fantasies. We 
have another, better option: We can decide 
to embrace the diversity of interpretations at 
our disposal because despite their respective 
flaws, each likely holds the key to at least one 
essential aspect of quantum behavior. 

Through Two Doors at Once offers begin- 
ners the tools they need to seriously engage 
with the philosophical questions that likely 
drew them to quantum mechanics. But 
readers will also receive a more important 
lesson, one that Feynman would have ap- 
proved: In science, a deep understanding 
is not achieved by limiting ourselves to a 
single perspective but by simultaneously exploring
competing conceptions of reality.

Does the rising sun cause the rooster to crow? Questions of
causality can finally be answered, write Pearl and Mackenzie. 


PODCAST 


The Book of Why 
The New Science of Cause 
and Effect 


Judea Pearl and Dana Mackenzie 
Basic Books, 2018. 429 pp. 


“Correlation does not imply causa- 
tion,” we have all been cautioned. 
How, then, are we to determine 
whether one thing has caused 
another? This week on the Science 
podcast, Judea Pearl and Dana 
Mackenzie discuss strategies for 
causal thinking and describe its impli- 
cations for artificial intelligence. 
Listen to the interview at: 
sciencemag.org/podcasts. 

LETTERS


Edited by Jennifer Sills 


Protecting U.S. 
temporary waterways 


Protecting the ecological health of rivers 
relies on maintaining intact flows from 
source areas to downstream navigable 
waters (1). Yet the U.S. Environmental 
Protection Agency (EPA) intends to 
rescind legal protection of tributary rivers, 
streams, and wetlands that do not have 
year-round flows (temporary waterways) 
and whose surface waters contribute flow 
to permanent navigable waters (2). This 
decision would severely damage the con- 
dition and uses of many U.S. waters, both 
temporary and navigable. 

Temporary waterways provide many 
ecosystem services, including water provi- 
sion and purification, that contribute 
substantially to securing water quantity 
and quality (3-5). Fifty-eight percent of all 
waterways that provide drinking water to 
the continental United States are tempo- 
rary or headwater streams, which support 
more than one-third of the United States’s 
population (6). Furthermore, temporary 
waterways harbor important biodiversity 
(5) and imperiled species (7) and underpin 
global carbon and nutrient cycles (8). Even 
when dry, they provide ecosystem services 
such as providing groundwater, attenuating 
toxicants, buffering floods, and providing 
habitat for unique biodiversity (5, 9). 

A comprehensive scientific review (10)
of all the services provided by temporary 
waterways led to the decision in 2015 
to recodify the definition of “waters of
the United States” to include temporary 
waters hydrologically connected to navi- 
gable waters. This provided protection 
to many temporary waterways under the 
U.S. Clean Water Act and was hailed as a 
wise, well-informed decision (4). However, 
the recodification has not yet been 
implemented because the legal process is 
incomplete, and now reversal of the deci- 
sion is expected (2). 

We urge the EPA to uphold its 2015 
decision and to ratify the legal status and 
protection of temporary waterways. This 
would provide U.S. temporary waterways 
with a level of protection similar to that 
in other countries, such as Australia (5). 
Failure to do so sets a poor global prec- 
edent and, more importantly, risks costly 
(11) and potentially irreversible harm to the 
ecosystem services supported by temporary 
waterways in the United States, including 
the provision of secure potable water. 

Privacy and genetic
genealogy data 


Law enforcement use of genetic genealogy 
has recently led to identifications of missing
persons and suspected criminals (1).
These successes have prompted discussions 
about the genetic privacy of individuals
whose DNA data are used in these
investigations, particularly with regard to 
control over the usage of one’s data and 
the sensitivity of the information that is 
obtained [“Genealogy databases and the 
future of criminal investigation,” N. Ram
et al., Policy Forum, 8 June, p. 1078, and 
(2, 3)]. As policies for law enforcement use 
of genetic genealogy are contemplated, 
several factors that mitigate the threat to 
privacy should be considered. 

First, only data voluntarily uploaded 
and explicitly made public are searched. 
Direct-to-consumer DNA testing compa- 
nies, from which most data arise, do not 
voluntarily participate in law enforce- 
ment investigations. Although it is 
possible for investigators to compel such 
companies to disclose user information, 
neither Ancestry.com nor 23andMe has 
turned over genetic information to law 
enforcement (4, 5), and forensic genetic 
genealogy has not involved acquiring 
data in this way. Instead, investigations 
have relied on data that individuals have 
chosen to download from a testing com- 
pany’s database and upload to GEDmatch, 
a public genealogy database. GEDmatch 
is open to anyone, including law enforce- 
ment, who wants to check the database 
for indications of kinship with DNA data 
in their possession (including data from 
crime-scene or victim samples). Before 
allowing a new or existing user to access 
the site, GEDmatch prominently displays 
the full text of the Terms of Service and 
Privacy Policy, which advises individuals 
that GEDmatch can be used “by third par- 
ties such as law enforcement agencies to 
identify the perpetrator of a crime, or to 
identify remains” (6). 

Because no one is legally required 
to contribute to a genetic genealogy
database, and because the samples are
not in the possession of government 
agencies, these searches are substantially 
different from familial searching of law 
enforcement databases (7). Jurisdictions 
that prevent or limit familial search- 
ing of those databases (8-12) should not 
automatically adopt identical restrictions 
on genetic genealogy investigations of 
publicly available databases. 

Another factor that lessens privacy 
concerns is that raw genetic data are 
not disclosed to law enforcement. Raw 
data contain highly personal and health- 
related information. Search results 
display only the length and chromosomal 
location of shared DNA blocks, which are 
used to determine approximate kinship 
relationships between individuals. The 
raw data cannot be accessed; only the 
possible genetic kinship among individu- 
als is shown. Customer relations create 
an incentive for testing companies and 
GEDmatch to maintain current policies of 
not releasing raw data without consent. 

Finally, genetic genealogy is for lead 
generation, not conviction. Genetic 
genealogy leads are tested by direct DNA 
matching to samples from persons of 
interest using standard forensic identifi- 
cation loci; only matches obtained with 
these well-established methods will result 
in continued investigation. 

Chile’s salmon escape
demands action 


After heavy winds and stormy conditions, 
about 650,000 salmon recently escaped 
from a net-pen aquaculture facility in 
southern Chile (1). This unintentional
influx of salmon, a potentially invasive 
species in Chile (2), is just the latest of 
many escapes of farmed salmon and trout 
(3). The escape of these non-native species 
highlights the risk that such aquaculture 
facilities pose to native ecosystems (3, 4). 

Chile’s US$4.6 billion aquaculture 
industry has positioned the country as a 
global provider of salmon and trout food 
products (5), and government institu- 
tions have contributed to this success (6). 
However, the Ministry of Environment, 
Superintendency of Environment, and 
National Fisheries Service, which are 
responsible for safeguarding biodiversity 
and fishery resources against potentially 
invasive species, have failed to coordinate 
with the aquaculture industry to provide 
the necessary short- and long-term moni- 
toring of the escaped salmon (7). 

To balance industry development with 
protection of native ecosystems and spe- 
cies, Chile must initiate new measures 
and invest critical institutional funding. 
The government should begin by pass- 
ing legislation requiring industry to limit 
escapes and develop enhanced biosecu- 
rity technology to prevent them [e.g., 
(8)]. Chile should also allow catch and 
commercialization of escapees by local 
fishers and provide testing for antibiotics 
to determine whether the catch is suit- 
able for human consumption. Finally, to 
mitigate the impacts of salmon aquacul- 
ture, Chile should implement ecosystem 
approaches recognizing that aquaculture 
affects other stakeholders and multiple 
ecological goods and services (9). Chile’s 
public is largely resistant to salmon farm- 
ing because of the environmental risks 
(10), whereas the government and indus- 
try focus instead on the opportunities 
aquaculture provides for jobs and devel- 
opment. All sides need to come together if 
there is to be a future for salmon aquacul- 
ture in the region. 

Comment on “Innovative scattering analysis
shows that hydrophobic disordered
proteins are expanded in water”


Robert B. Best, Wenwei Zheng, Alessandro 
Borgia, Karin Buholzer, Madeleine B. 
Borgia, Hagen Hofmann, Andrea Soranno, 
Daniel Nettels, Klaus Gast, Alexander 
Grishaev, Benjamin Schuler 

Riback et al. (Reports, 13 October 2017,
p. 238) used small-angle x-ray scattering
(SAXS) experiments to infer a degree
of compaction for unfolded proteins in
water versus chemical denaturant that
is highly consistent with the results from
Förster resonance energy transfer (FRET)
experiments. There is thus no “contradiction”
between the two methods, nor
evidence to support their claim that
commonly used FRET fluorophores cause
protein compaction.

Full text: dx.doi.org/10.1126/science.aar7101 


Response to Comment on “Innovative 
scattering analysis shows that hydrophobic 
disordered proteins are expanded in water” 
Joshua A. Riback, Micayla A. Bowman, 
Adam Zmyslowski, Catherine R. Knoverek,
John Jumper, Emily B. Kaye, Karl F. Freed,
Patricia L. Clark, Tobin R. Sosnick 

Best et al. claim that we provide no convincing 
basis to assert that a discrepancy remains 
between FRET and SAXS results on the 
dimensions of disordered proteins under 
physiological conditions. We maintain that a 
clear discrepancy is apparent in our and other 
recent publications, including results shown in 
the Best et al. comment. A plausible origin is 
fluorophore interactions in FRET experiments. 
Full text: dx.doi.org/10.1126/science.aar7949 


Comment on “Innovative scattering analysis 
shows that hydrophobic disordered proteins 
are expanded in water” 


Gustavo Fuertes, Niccolo Banterle, 
Kiersten M. Ruff, Aritra Chowdhury, 
Rohit V. Pappu, Dmitri I. Svergun, 
Edward A. Lemke 

Editors at Science requested our input on the
above discussion (comment by Best et al.
and response by Riback et al.) because both
sets of authors use our data from Fuertes et
al. (2017) to support their arguments. The
topic of discussion pertains to the discrepant
inferences drawn from SAXS versus FRET
measurements regarding the dimensions
of intrinsically disordered proteins (IDPs)
in aqueous solvents. Using SAXS measurements
on labeled and unlabeled proteins,
we ruled out the labels used for FRET
measurements as the cause of discrepant
inferences between the two methods.
Instead, we propose that FRET and SAXS
provide complementary readouts because of
a decoupling of size and shape fluctuations
that is intrinsic to finite-sized, heteropolymeric
IDPs. Accounting for this decoupling
resolves the discrepant inferences between
the two methods, thus making a case for the
utility of both methods.

Full text: dx.doi.org/10.1126/science.aau8230 


Editor’s Note: To expedite publication,
we have decided to post some Technical
Comments before their responses, which will
run in a later issue.


Comment on “Unexpected reversal of C3
versus C4 grass response to elevated CO2
during a 20-year field experiment”

Ming Nie, Junyu Zou, Xiao Xu, Chao 
Liang, Changming Fang, Bo Li 

Reich et al. (Reports, 20 April 2018, p. 317)
reported that elevated carbon dioxide
(eCO2) switched its effect from promoting C3
grasses to favoring C4 grasses in a long-term
experiment. We argue that the authors did
not appropriately elucidate the interannual
climate variation as a potential mechanism
for the reversal of C4-C3 biomass in response
to eCO2.

Full text: dx.doi.org/10.1126/science.aau3016 

DISORDERED PROTEINS

Comment on “Innovative scattering analysis shows that hydrophobic
disordered proteins are expanded in water”

Robert B. Best, Wenwei Zheng, Alessandro Borgia, Karin Buholzer,
Madeleine B. Borgia, Hagen Hofmann, Andrea Soranno, Daniel Nettels,
Klaus Gast, Alexander Grishaev, Benjamin Schuler


Riback et al. (Reports, 13 October 2017, p. 238) used small-angle x-ray scattering (SAXS)
experiments to infer a degree of compaction for unfolded proteins in water versus
chemical denaturant that is highly consistent with the results from Förster resonance
energy transfer (FRET) experiments. There is thus no “contradiction” between the two
methods, nor evidence to support their claim that commonly used FRET fluorophores
cause protein compaction.


Riback et al. (1) recently presented a “molecular form factor” (MFF) method
addressing the well-known challenges (2) of analyzing small-angle x-ray scattering
(SAXS) data for unfolded or intrinsically disordered proteins (IDPs) (3, 4).
Combined with the precision of SAXS measurements coupled to size exclusion
chromatography, their method yielded the following results: (i) Unfolded proteins
in water have a polymer scaling exponent ν ≈ 1/2, near the theta-solvent condition
where protein-protein and protein-solvent interactions are balanced; in denaturant,
this increases to ν ≈ 3/5, the limit where the protein-solvent interactions dominate.
(ii) This change of scaling exponent is accompanied by an increase in radius
of gyration (Rg) of 10% to 20%, depending on the sequence. We are pleased that
these findings are in overall agreement with SAXS and Förster resonance energy
transfer (FRET) studies from our laboratories (3, 5, 6) and others (4).

The chain expansion observed by Riback et al. 
helps to resolve a long-standing controversy be- 
tween SAXS and FRET experiments (7): With 
increasing denaturant concentration, FRET experiments
had generally shown chain expansion,
while until recently (3, 4) most SAXS studies
observed no statistically significant change of Rg
[(8) and references therein]. Their results are
consistent with our recent collaborative study
in which we compared SAXS and FRET estimates
of Rg for each of two proteins (necessary
because chain dimensions can be sequence- 
dependent). We found that the results are mu- 
tually consistent if both data types are analyzed 
with state-of-the-art methods (3, 6) (Fig. 1). A 
second study of a large group of IDPs reached a 
similar conclusion (4). 

The main reasons for the discrepancy were deficiencies in the analysis of both
SAXS and FRET data. Earlier SAXS experiments underestimated expansion because the
unfolded state of the foldable sequences studied could only be accessed above a
certain denaturant concentration (3), as also pointed out by Riback et al., and
because obtaining precise and accurate Rg values from SAXS data of IDPs using the
Guinier approximation is challenging (1, 3, 4). The former difficulty has been
overcome by studying destabilized or intrinsically disordered proteins (1, 3, 4),
the latter by improved analysis such as Bayesian ensemble refinement (3, 4, 6, 9)
or the closely related MFF method (1). On the FRET side, the use of polymer models,
such as a Gaussian chain or self-avoiding random walk (SARW), to interpret
experimental results can overestimate the change in Rg (3, 10, 11) (Fig. 1),
largely because the relative change of Rg upon chain expansion is intrinsically
less than that of the end-to-end distance most commonly measured by FRET
(3, 4, 10, 11). With ensemble refinement applied to either SAXS or FRET, or both
combined, the data from both experiments yield consistent distributions of
conformations, considering statistical error (3, 4, 6), as shown in Fig. 1 (3).

Fig. 1. Rg from Bayesian ensemble refinement against experimental data for
unfolded proteins in denaturant using FRET, SAXS, or both experiments. Results
are shown for two proteins (ACTR, R17) in urea and guanidinium chloride (GdmCl)
(3). Rg values from FRET using Gaussian chain or SARW models (3) are shown for
reference. [Plot axes: Rg (nm) versus [GdmCl] (M).]

We therefore dispute the authors’ claim that 
their results are “in apparent contradiction to a 
variety of FRET measurements,” given that FRET 
experiments have not been reported on the sequences
they studied. Their results are consistent
with the magnitude of the change in both Rg and
ν with denaturant inferred from recent FRET
studies (3, 4), including the larger change in Rg
and ν below 2 M GdmCl (Fig. 2A). Their ν values
of 0.48 to 0.54 in water or low denaturant
concentration are within the range (reflecting
sequence-dependent variation of ν) obtained on
the basis of ensemble refinement of data from
previous FRET studies (3-6) (Fig. 2B).

Despite this consistency, the authors suggest that “addition of fluorophores with
hydrophobic character may lead to chain compaction and may contribute to FRET
signal changes” (1). Although some extremely hydrophobic FRET fluorophores can
indeed cause additional compaction under native conditions, ensemble refinement
identified the inconsistency between the resulting FRET and SAXS data (3).
However, results for the more hydrophilic fluorophores most commonly used were in
good agreement with SAXS data (3) (Fig. 1). Furthermore, a recent tour-de-force
SAXS study of proteins with and without fluorophores showed only small
perturbations and no systematic changes of Rg and ν upon labeling (4) (Fig. 2B).
The evidence presented by Riback et al. to support their claim comes not from a
protein but from earlier small-angle neutron scattering (SANS) and FRET
measurements on polyethylene glycol (PEG) (12). PEG lacks complications from a
folded state, such as those that previously (13) caused these authors to overlook
ubiquitin expansion (1). The PEG study, however, used old protocols to analyze the
data. Applying such earlier methods to a protein lacking a folded state, the
authors had determined that “fully reduced ribonuclease A does not expand at high
denaturant concentration” (14), but they now find an expansion for the same
protein [figure 3C in (1)].

Fig. 2. Polymer scaling exponents, ν, for unfolded or disordered proteins.
(A) Denaturant-dependent ν from SAXS data of Riback et al. for PNt (red) and
P domain (blue) (1) compared with those from FRET data for a variety of unfolded
and disordered proteins (gray) (5) and for the IDP ACTR and a destabilized
spectrin R17 domain (green) (3). Exponents were obtained from Bayesian ensemble
refinement (3) of primary FRET or SAXS data (“Bayes”) or MFF analysis where
indicated. Curves are fits to a binding model (5) or to a polyelectrolyte model
for IN and ACTR (3). (B) Scaling exponents versus Kyte-Doolittle hydrophobicity
(15) (rescaled between 0 and 1) for the same proteins in water or low denaturant
concentration, as well as additional data for a set of IDPs in water from
Fuertes et al. (4). Results for MFF (1) and Bayesian ensemble refinement (3) are
highly consistent. Error bars indicate statistical error.

Riback et al. thus do not provide a convincing 
basis for their assertion that the conclusions of 
FRET and SAXS experiments are contradictory. 
Rather, their results add to the increasingly con- 
sistent picture of unfolded and intrinsically dis- 
ordered proteins that has been emerging in 
recent years. 

Best et al. claim that we provide no convincing basis to assert that a discrepancy remains
between FRET and SAXS results on the dimensions of disordered proteins under physiological
conditions. We maintain that a clear discrepancy is apparent in our and other recent
publications, including results shown in the Best et al. comment. A plausible origin is
fluorophore interactions in FRET experiments.


Using our new molecular form factor (MFF), we analyzed small-angle x-ray scattering
(SAXS) data for three intrinsically disordered proteins (IDPs) and found that upon
a shift from 6 to 0 M guanidinium chloride (Gdn), there was a mild decrease in
radius of gyration (Rg) (1). For these 124- to 334-residue IDPs with amino acid
sequences typical of foldable proteins, the value of Rg decreased by 20 to 28% in
water. As predicted from scaling laws, the denatured states of smaller proteins
such as ubiquitin and protein L (76 and 64 amino acids, respectively) will contract
less (17 and 15%, respectively). Notably, approximately half of this contraction
occurs below 1 M Gdn, beyond the limit of many prior kinetic studies. For example,
for ubiquitin, only an 8% decrease (~2 Å) in Rg is expected from 6 to 1 M Gdn.
This mild and nonlinear decrease in Rg explains why many previous SAXS studies of
small proteins did not observe a statistically significant decrease in Rg (2-6),
a conclusion we highlighted in figure 4A of our publication (1).
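
The chain-length dependence invoked above can be illustrated with a minimal sketch. It assumes a pure power law Rg = b * N**ν with the same prefactor b in water and in denaturant, and it takes ν = 0.60 in denaturant and ν = 0.54 in water as illustrative inputs. Both the constant prefactor and the function name are assumptions made here, so the absolute percentages it prints are not expected to match the values quoted from (1); only the trend (shorter chains contract less) is illustrated.

```python
def fractional_contraction(n_residues, nu_denaturant=0.60, nu_water=0.54):
    """Fractional decrease in Rg on going from denaturant to water.

    Assumes Rg = b * N**nu with an identical prefactor b in both solvents,
    a simplification made only for this illustration.
    """
    # Ratio Rg(water) / Rg(denaturant); the prefactor b cancels.
    ratio = n_residues ** (nu_water - nu_denaturant)
    return 1.0 - ratio


if __name__ == "__main__":
    # Chain lengths mentioned in the text: protein L, ubiquitin, and the
    # 124- to 334-residue IDPs.
    for n in (64, 76, 124, 334):
        print(f"N = {n:3d}: Rg decreases by {100 * fractional_contraction(n):.0f}%")
```

Because the exponent difference is fixed, the contraction in this toy model grows only logarithmically with chain length, which is why small proteins are the hardest cases in which to detect it.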

In contrast, Förster resonance energy transfer (FRET) studies have observed much
more contraction, up to a 50% decrease in Rg in the absence of denaturant, even for
small proteins (3, 7, 8). Furthermore, the majority of this contraction occurs
above 1 M Gdn (Fig. 1). These findings are different from those observed using
SAXS and are noticeable in figure 1 of Best et al. (9) (compare green and black
points at lowest denaturant concentration). Hence, our methods and those of
Best et al. concur that in the absence of denaturant, SAXS analysis returns Rg
values ~15% larger than FRET studies when each data set is analyzed independently.
In our publication, we hypothesized that this discrepancy could be due to the
addition of dyes necessary for FRET studies (2). In contrast, Best et al. provide
no physical explanation and instead rely on a joint analysis of both types of data
that only reduces the appearance of the discrepancy.

An alternative, chain length-independent method to compare results from SAXS and
FRET studies is to compare the Flory exponent (ν in the relationship Rg ∝ N^ν,
where N = protein length). Values of ν equal to 0.33 or 0.60 correspond to a
globule or a self-avoiding random walk (SARW), respectively. In a theta solvent,
ν = 0.50 and intrachain interactions are equally as favorable as solvent-chain
interactions; this value defines the boundary between good and poor solvents. For
the three foldable sequences we examined using SAXS, ν in water is 0.54. In
contrast, using FRET, Hofmann et al. (10) found that ν for four foldable sequences
in water ranged from 0.4 to 0.51. Because the range of ν between a globule and a
SARW is only 0.27, a difference of 0.1 units represents 38% of the entire range.
Hence, to argue that both methods are in agreement if the values of ν are near 1/2
is imprecise.
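
The arithmetic behind this argument can be checked with a short sketch. It uses the exact limiting exponents 1/3 (globule) and 3/5 (SARW) rather than the rounded 0.33 and 0.60, which is an assumption about how the quoted percentage was obtained; the function name is hypothetical.

```python
from fractions import Fraction

NU_GLOBULE = Fraction(1, 3)  # compact globule
NU_THETA = Fraction(1, 2)    # theta solvent, boundary between poor and good solvent
NU_SARW = Fraction(3, 5)     # self-avoiding random walk


def fraction_of_range(delta_nu):
    """Share of the globule-to-SARW range covered by a difference in nu."""
    return Fraction(delta_nu) / (NU_SARW - NU_GLOBULE)


if __name__ == "__main__":
    share = fraction_of_range(Fraction(1, 10))
    # A difference of 0.1 in nu, expressed as a percentage of the full range.
    print(f"{float(share):.1%}")
```

With these inputs the share is 3/8 of the range, in line with the figure of roughly 38% given in the text.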

This discrepancy is not limited to the specific proteins selected by us or by
Hofmann et al. (10). We reanalyzed available SAXS data for other IDPs with
sequences typical of folded proteins (11-17) using our new MFF procedure and
compared these results to ν extracted from available FRET studies of disordered
proteins with foldable sequences (Fig. 2A). When measured by SAXS, ν is typically
above 0.54, whereas ν from FRET studies is typically below the theta condition of
0.50. Note that our Fig. 2A should, in principle, match figure 2B of Best et al.
However, figure 2B of Best et al. does not include results from two of the most
collapsed proteins that Fuertes et al. (11) studied by FRET: R98 and NSP. The FRET
and SAXS values for these proteins differ significantly (ν = 0.44 versus 0.56 for
R98 and 0.49 versus 0.60 for NSP, respectively). Moreover, Best et al. plot results
for other proteins (e.g., Csp, hCyp, and R15) in ~1 M denaturant rather than the
value extrapolated to 0 M denaturant, as they calculated in figure 2A and reported
in their original study [Hofmann et al. (10)]. These differences misleadingly
minimize the discrepancy between the SAXS and FRET results. Plotting instead
results for all proteins with sequences typical of folded proteins in the absence
of denaturant, as in our Fig. 2A, reveals clear discrepancies between results from
these methods.

Figure 2 of Best et al. emphasizes the sole exception to ν > 0.5 in our SAXS data:
the P domain, a hydrophobic IDP region of low sequence complexity whose collapse
correlates with stress granule formation (18). Although this finding indicates
that some hydrophobic, nonfoldable sequences can collapse, it does not speak
directly to the discrepancy between SAXS and FRET results for protein-like
sequences.

Having established that discrepancies remain between results of SAXS and FRET
studies, we considered possible origins. As we discussed (1), Chan and others have
proposed that complications in converting FRET efficiency to Rg could account for
some of this difference (11, 19-21). The other obvious issue is the presence of
fluorophores.

Schuler and co-workers have found that some fluorophore pairs influence FRET
signals but maintain that the Alexa 488/594 pair is suitable (22). Their all-atom
molecular dynamics simulations, however, reported a 10% contraction for an IDP
with versus without fluorophores in 1 M urea (23). In the absence of denaturant,
this dye effect would presumably be even larger. Recently, Fuertes et al.
conducted SAXS measurements on five proteins with and without Alexa 488/594 (11).
In all cases, addition of fluorophores changed the SAXS profile. From the change in
scattering between the labeled and unlabeled proteins, the authors calculated
meaningful differences (ν_unlabeled − ν_labeled = Δν = 0.08, 0.03, 0.03, -0.02,
-0.04). Reanalysis of these data with our MFF (1) finds Δν = 0.09, 0.06, 0.03,
-0.02, -0.08 (Fig. 2B). To address an additional concern raised by Best et al., we
reanalyzed published polyethylene glycol (PEG) scattering data (8) with our MFF and
confirmed the published observation that whereas FRET-measured PEG labeled with
Alexa 488/594 contracts in lower denaturant, PEG without labels does not contract
(Fig. 2C). Hence, significant discrepancies exist between FRET and scattering
techniques even when using fluorophores considered suitable.

We are gratified that Best et al. agree that water should be considered to be a
good solvent for the denatured state of most foldable sequences (ν > 0.5). However,
an analysis of the FRET results alone would lead one to believe the opposite
(ν < 0.5). The presence of fluorophores in FRET studies is a likely origin of this
difference.

Fig. 1. SAXS and FRET exhibit different denaturant dependence for two proteins,
R17 and ACTR, studied by Best and co-workers. All primary data are from (22). Open
and solid red circles are two models applied to FRET data (reweighted simulations
and a SARW, respectively), as presented in (22). Open and solid black circles are
SAXS data fit using reweighted simulations [as presented in (22)] and our MFF (1),
respectively. The black line is the expected SAXS denaturant dependence of Rg taken
from (1), assuming that 2 M urea is equivalent to 1 M Gdn.

Fig. 2. SAXS and FRET show inconsistent solvent qualities for IDPs in water.
(A) Trends of hydrophobicity (Kyte-Doolittle scale) versus ν in water for SAXS data
of foldable protein sequences fit to our MFF for data from our recent study (1)
(black circles), Fuertes et al. (11) (black squares), and other studies (12-17)
(black triangles). Also shown are results from FRET studies calculated as in (10)
for data from (10) (red circles) and (11) (red squares). The red trend line for
FRET results is from (10). The black trend line is the best fit to the SAXS data
shown. At the top is a histogram of hydrophobicity of representative proteins in
the PDB [dataset from (1)] with sequences in the FRET-inferred poor solvent region
highlighted in red. (B) SAXS profiles from (11) for unlabeled and labeled versions
of the NLS protein fit using our MFF. Solid lines span data points used for
fitting. Error bars are the SD appropriate for Poisson counting statistics.
(C) PEG results from (8) remain unchanged with improved analysis. Small-angle
neutron scattering (SANS) data (black) are fit to our MFF; FRET data (red) are fit
assuming a SARW distribution and normalized to high denaturant.

Editors at Science requested our input on the above discussion (comment by Best et al. and
response by Riback et al.) because both sets of authors use our data from Fuertes et al. (2017)
to support their arguments. The topic of discussion pertains to the discrepant inferences
drawn from SAXS versus FRET measurements regarding the dimensions of intrinsically
disordered proteins (IDPs) in aqueous solvents. Using SAXS measurements on labeled and
unlabeled proteins, we ruled out the labels used for FRET measurements as the cause of
discrepant inferences between the two methods. Instead, we propose that FRET and SAXS
provide complementary readouts because of a decoupling of size and shape fluctuations
that is intrinsic to finite-sized, heteropolymeric IDPs. Accounting for this decoupling
resolves the discrepant inferences between the two methods, thus making a case for the
utility of both methods.


Quantitative descriptions of conformational ensembles of intrinsically disordered
proteins (IDPs) are directly relevant for understanding the functions and cellular
processes controlled by IDPs. Small-angle x-ray scattering (SAXS) and Förster
resonance energy transfer (FRET) are two experimental techniques that have been
widely used to quantify the overall sizes and shapes of IDPs in different milieus.
The two techniques yield convergent descriptions regarding the dimensions of IDPs
in high concentrations of denaturants, but they can yield discrepant inferences in
the absence of denaturants (1-3). What is the source of these discrepant
inferences? Are they caused by the dyes used in FRET measurements, as proposed
by Riback et al. (4)? Or does the discrepancy come from the method of analysis of
SAXS and FRET data, as proposed by Best et al. (5)?

On the basis of direct SAXS measurements of IDPs with and without dyes, we have
experimentally demonstrated that the dyes are not the source of systematic biases
and the discrepant inferences (1). However, as a general resolution of the conflict
between SAXS and FRET, our results point to sequence-specific decoupling between
end-to-end distances (Re) and radii of gyration (Rg). It is important to note that
Re is directly derivable from FRET data, whereas Rg is directly derivable from SAXS
data. FRET data cannot be used to extract Rg without making a series of simplifying
assumptions about the connections between Re and Rg, and the converse is true of
SAXS data. Indeed, much of the conflict in inferences drawn from SAXS versus FRET
originates from the fact that the two methods provide access to two different
quantities. When we allow for the possibility that Re and Rg can be decoupled from
one another because of shape fluctuations that are consequences of the finite size
and the heteropolymeric nature of IDPs, we arrive at a reconciled view, which
suggests that SAXS and FRET yield complementary rather than contradictory insights.
Therefore, we propose that the debate should not be about the merits or demerits of
the two methods. Both methods have their strengths and weaknesses; therefore, the
focus should be on the growing consensus regarding methodological advances that
rely on improved numerical/theoretical analysis and the use of sophisticated
atomistic simulations to analyze data from SAXS and FRET measurements.

To directly test the effects of the dyes, we performed SAXS measurements with
labeled and unlabeled IDPs on a series of molecules with different sequence
attributes (7). We pursued global fits to analyze the relationship of measured Re
and Rg from several proteins giving a broader coverage of sequence space. Scaling
theories suggest that Rg ∝ N^ν, where the critical exponent ν describes how the
protein dimensions scale with the number of residues (N). We agree that comparing
Δν = ν_labeled − ν_unlabeled is one way to detect potential dye interferences.
Indeed, it is reassuring that estimates of Δν in the absence of denaturant show the
lack of systematic trends in Δν. The values for Δν range from being negative to
positive, and importantly, Δν is zero on average (Δν = 0.03, 0.08, 0.03, -0.02,
-0.04; average 0.02 ± 0.04). To test whether this level of deviation is meaningful
or lies within the experimental uncertainty, we conducted SAXS and FRET under
highly denaturing conditions, where all independent assessments converge. These
experiments yield similarly small Δν values ranging from positive to negative
(Δν = -0.07, -0.09, -0.01, 0.04, 0.00; average -0.03 ± 0.05). Our direct
measurements negate the hypothesis that the selected FRET dyes used by many groups
including Fuertes et al. and Borgia et al. (1, 3) are systematic modifiers of
conformational ensembles, although sequence-specific interactions involving
specific dyes may prevail in some cases.
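
The quoted averages can be reproduced from the listed Δν values with a few lines of code. The text does not say which spread estimate was used; this sketch assumes the population standard deviation because that choice reproduces both quoted uncertainties after rounding. The variable and function names are illustrative only.

```python
from statistics import mean, pstdev

# Delta-nu values as listed in the text (labeled minus unlabeled).
delta_nu_water = [0.03, 0.08, 0.03, -0.02, -0.04]
delta_nu_denaturant = [-0.07, -0.09, -0.01, 0.04, 0.00]


def summarize(values):
    """Return (mean, population SD), each rounded to two decimals."""
    return round(mean(values), 2), round(pstdev(values), 2)


if __name__ == "__main__":
    for label, values in (("no denaturant", delta_nu_water),
                          ("high denaturant", delta_nu_denaturant)):
        avg, spread = summarize(values)
        print(f"{label}: {avg:+.2f} ± {spread:.2f}")
```

In both conditions the mean lies within one standard deviation of zero, which is the sense in which the values show no systematic trend.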

In their response, Riback et al. picked out the 
sequence designated as NLS from our set of IDPs 
for their analysis [figure 2B of (4)]. For this se- 
quence, there is indeed a small but measurable 
difference between the SAXS data for unlabeled 
and labeled proteins. This observation is con- 
sistent with our analysis (7) but is limited to NLS, 
whereas other IDPs in our study showed vir- 
tually identical normalized scattering behavior 
for labeled versus unlabeled IDPs. Of course the 
label contributes a detectable increase in the 
molecular weight of the protein, and this is 
accounted for in the analysis and interpretation 
of the SAXS data (1). Focusing on the NLS sequence
reveals a pitfall of isolating a single measurement
from an entire dataset to draw general
conclusions. Of all the proteins in our dataset, 
NLS was the only protein for which we could 
not perform measurements in phosphate-buffered 
saline (PBS) because of confounding effects of 
the milieu. Instead, we had to perform mea- 
surements in a zwitterionic HEPES buffer. This 
“complication” is in line with this particular pro- 
tein being highly charged, and thus sensitivity 
toward certain buffers as well as charged dyes is 
to be expected. If one were to extrapolate from 
the NLS data and assert that dyes have an un- 
ambiguous perturbing effect on FRET measure- 
ments, then one would need to conclude that in 
general no measurements can be made for pro- 
teins in physiologically relevant PBS buffers, thus 
highlighting the issues one confronts with mak- 
ing extrapolations from a specific data point. 

Having largely exonerated the dyes as the source of any form of systematic bias,
we are left with the question of why the extent of contraction previously inferred
from FRET is higher than from SAXS and why this difference is manifested for IDPs
in the absence of denaturant. We propose that the discrepant degrees of contraction
inferred from FRET versus SAXS originate in the distinct types of quantities that
the two techniques measure and the assumptions used to convert between them. This
proposal is consistent with issues raised by Borgia et al. (3). Converting FRET
efficiency (E_FRET) into an average Re value requires the use of theoretical
polymer models or of ensembles derived from computer simulations. Additionally,
SAXS provides inferences regarding Rg, whereas FRET provides inferences regarding
Re. To compare the two inferences, one simple approach is to convert between Re
and Rg using a multiplicative factor: for example, Rg = Re/√6 for a Gaussian
chain. However, IDPs cannot be assumed to behave as Gaussian chains in all milieus.
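
To make the model dependence concrete, the following sketch computes the mean FRET efficiency expected for an ideal Gaussian chain with a given root-mean-square end-to-end distance, by averaging the standard Förster relation E(r) = 1/(1 + (r/R0)^6) over the Gaussian-chain distance distribution. It is only an illustration of the kind of polymer model the text refers to, not the analysis used in any of the cited studies; the Förster radius of 5.4 nm, the integration settings, and all function names are assumptions chosen for the example.

```python
import math


def gaussian_chain_pdf(r, rms_re):
    """End-to-end distance distribution P(r) of an ideal Gaussian chain."""
    a = 3.0 / (2.0 * rms_re ** 2)
    return 4.0 * math.pi * r ** 2 * (a / math.pi) ** 1.5 * math.exp(-a * r ** 2)


def mean_fret_efficiency(rms_re, r0, n_steps=4000):
    """Average E = 1 / (1 + (r/R0)**6) over P(r) using the trapezoid rule."""
    r_max = 6.0 * rms_re  # the distribution is negligible beyond this distance
    dr = r_max / n_steps
    total = 0.0
    for i in range(n_steps + 1):
        r = i * dr
        weight = 0.5 if i in (0, n_steps) else 1.0
        total += weight * gaussian_chain_pdf(r, rms_re) / (1.0 + (r / r0) ** 6)
    return total * dr


if __name__ == "__main__":
    r0_nm = 5.4  # assumed Förster radius, for illustration only
    for rms_re_nm in (4.0, 6.0, 8.0):
        e = mean_fret_efficiency(rms_re_nm, r0_nm)
        print(f"rms Re = {rms_re_nm:.1f} nm -> mean E = {e:.3f}")
```

In practice the calculation is run in reverse: a measured efficiency is matched to the Re that reproduces it, so replacing the Gaussian distribution with another model changes the inferred Re, and any Rg derived from it, even though the measurement is unchanged.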

Motivated by the dependence on polymer models for the values of Rg extracted from
FRET efficiencies (6, 7), we abandoned the ansatz that Re and Rg must be coupled by
a unique constant. This is justified because (i) proteins are not homopolymers
[e.g., PEG (8)], and (ii) different geometrical objects have very different Rg/Re
relationships (defined as G = Re²/Rg²) such that G = 1.31 for a sphere, G = 12 for
a rod, G = 6 for an ideal Gaussian chain with a scaling exponent ν = 0.5, and
G = 7.04 for a self-avoiding random walk, where ν = 0.6. Consequently, when
measuring only Re, there exist multiple solutions for Rg depending on the shape of
the IDP ensemble and vice versa. We showed that Re and Rg are two related but
genuinely distinct measures of IDP conformations. Away from the Gaussian chain
limit, the two quantities can be readily decoupled from one another. Accordingly,
the two measures together provide complementary albeit distinct insights regarding
the dimensions of IDPs in the absence of denaturant. Using only one measure as
opposed to both Re and Rg leads to discrepant inferences because the models used to
interpret either dataset end up imposing a homogeneity that appears to be
unwarranted for IDPs in the absence of denaturant (7). We propose that the extent
of decoupling between Re and Rg depends on the amino acid sequence and solution
conditions.
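
A small numerical sketch shows how strongly the inferred Rg depends on the assumed shape. It takes the G values quoted above at face value, with G = Re²/Rg², and applies them to a single hypothetical end-to-end distance; the 6.0 nm input and the function name are illustrative assumptions.

```python
import math

# G = Re**2 / Rg**2 for the reference shapes quoted in the text.
G_VALUES = {
    "sphere": 1.31,
    "ideal Gaussian chain (nu = 0.5)": 6.0,
    "self-avoiding random walk (nu = 0.6)": 7.04,
    "rod": 12.0,
}


def rg_from_re(re, g):
    """Radius of gyration implied by an end-to-end distance for a given G."""
    return re / math.sqrt(g)


if __name__ == "__main__":
    re_nm = 6.0  # hypothetical end-to-end distance, for illustration only
    for shape, g in G_VALUES.items():
        print(f"{shape}: Rg = {rg_from_re(re_nm, g):.2f} nm")
```

The same Re therefore maps onto a range of Rg values, which is why a single fixed conversion factor can manufacture an apparent disagreement between FRET and SAXS.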

Overall, using ensemble-based numerical analysis
methods that avoid simplifications for the
conversion between Re and Rg can help to minimize
apparent discrepancies between SAXS and
FRET (1, 3, 7). Along these lines, Best et al.
summarize a method that was used by Borgia et al.
to integrate SAXS and FRET data (3). The ap- 
proaches used by Borgia et al. (3) and Fuertes et al. 
(1) are complementary and the findings that 
emerge are in general agreement with one an- 
other. Both groups recognize the weaknesses of 
simplified homopolymer models for analyzing 
SAXS and FRET data. Neither group finds detectable
evidence for systematic compacting
effects of dyes. Minor positive and negative
deviations in the inferred scaling exponents from
FRET vis-à-vis SAXS are to be expected for finite-sized
heteropolymers, given the errors and features
that are associated with both experimental
methodologies.

To conclude, it is worth reiterating that our
statement that “dyes are not the source of the
discrepancy” results from direct measurements
rather than conjecture. Further, our explanations
for resolving the discrepant inferences between
SAXS and FRET invoke a decoupling of Re and
Rg. This hypothesis is substantiated by numerical
evidence for the decoupling that explains all the
available data. Nonetheless, it remains a phenomenological
model and demands careful theoretical
and experimental scrutiny.

Reich et al. (Reports, 20 April 2018, p. 317) reported that elevated carbon dioxide (eCO2) switched its
effect from promoting C3 grasses to favoring C4 grasses in a long-term experiment. We argue that the
authors did not appropriately elucidate the interannual climate variation as a potential mechanism for the
reversal of C4-C3 biomass in response to eCO2.


Reich et al. (1) presented results of a long-term free-air CO2 
enrichment experiment. The results showed that elevated 
CO2 (eCO2) favored C3 grasses rather than C4 grasses during 
the first 12 years; however, the pattern reversed during the 
subsequent eight years. It appears that their observations 
regarding the changes in C4-C3 grasses under eCO2 conditions 
did not reflect the effects of inter-annual variations in the 
ambient rainfall and temperature during the 20-yr experimental 
period, leading to uncertainties in their results. 

The effect of eCO2 on plant biomass largely depends on 
the ambient rainfall and temperature (2, 3). However, Reich 
et al. (1) found that the responses of C4 and C3 grasses to 
eCO2 had negligible dependence on these important climatic 
factors, as determined by estimating the effects of multiple 
variables on C4-C3 biomass with repeated-measures analysis. 
According to the Cedar Creek weather data from Fort Snelling 
near the Saint Paul International Airport, the ambient 
total rainfall (316-722 mm) and average temperature (18.6-21.4°C) 
during the growing season had considerable inter-annual 
variations during the 20-yr experimental period. Using 
generalized linear models, we found that both the growing-season 
rainfall and average temperature positively 
correlated with the effect of CO2 on C4 biomass and the 
growing-season average temperature negatively correlated 
with the effect of CO2 on C3 biomass (Fig. 1). Because it avoids 
potential collinearity among the explanatory variables and the order 
effects of repeated-measures analysis, our analysis is more 
appropriate for estimating the effect of each individual variable on 
the response of C4 or C3 biomass to eCO2 with an accurate 
and interpretable predictor. 
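
The analysis described here (3-yr centered moving averages, a CO2 effect size defined as biomass under eCO2 minus biomass under ambient CO2, and a Gaussian generalized linear model with an identity link) can be sketched as follows. A Gaussian GLM with an identity link is equivalent to ordinary least squares, so the sketch uses NumPy's least-squares solver rather than the R functions used for the figures. All numbers are made-up placeholders, not the Cedar Creek data, and the function names are hypothetical. The partial R^2 uses the classical sum-of-squares definition, which is assumed, not confirmed, to match the setting used in the original analysis.

```python
import numpy as np


def moving_average_3yr(values):
    """Centered 3-yr moving average; yields len(values) - 2 points."""
    v = np.asarray(values, dtype=float)
    return np.convolve(v, np.ones(3) / 3.0, mode="valid")


def co2_effect_size(biomass_elevated, biomass_ambient):
    """Effect size = biomass under eCO2 minus biomass under ambient CO2."""
    return np.asarray(biomass_elevated, float) - np.asarray(biomass_ambient, float)


def _fit(X, y):
    """Least-squares fit; returns coefficients and residual sum of squares."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return coef, float(resid @ resid)


def fit_effect_model(effect, temperature, rainfall):
    """Fit effect ~ temperature + rainfall (Gaussian family, identity link).

    Returns (coefficients [intercept, b_temp, b_rain], partial R^2 per factor).
    Partial R^2 = (SSE_reduced - SSE_full) / SSE_reduced, where the reduced
    model drops the factor in question.
    """
    y = np.asarray(effect, float)
    t = np.asarray(temperature, float)
    r = np.asarray(rainfall, float)
    ones = np.ones_like(y)
    coef, sse_full = _fit(np.column_stack([ones, t, r]), y)
    _, sse_no_temp = _fit(np.column_stack([ones, r]), y)
    _, sse_no_rain = _fit(np.column_stack([ones, t]), y)
    partial = {
        "temperature": (sse_no_temp - sse_full) / sse_no_temp,
        "rainfall": (sse_no_rain - sse_full) / sse_no_rain,
    }
    return coef, partial


# Hypothetical yearly values (placeholders, not measured data).
ambient = [50, 55, 48, 60, 52, 58, 61, 57]
elevated = [54, 57, 55, 66, 53, 65, 70, 62]
temp = [19.0, 19.5, 18.8, 20.1, 19.2, 20.4, 21.0, 20.2]
rain = [400, 520, 350, 610, 330, 580, 700, 640]

effect = moving_average_3yr(co2_effect_size(elevated, ambient))
coef, partial = fit_effect_model(
    effect, moving_average_3yr(temp), moving_average_3yr(rain)
)
print("intercept, b_temp, b_rain:", coef)
print("partial R^2:", partial)
```

Smoothing before fitting shortens the series by two points and makes neighboring points share data, so the fitted P values from such a model should be read with that in mind.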

The change in C4 biomass showed a sharp decrease 
from 2005 to 2008 (Fig. 2), and the C3 biomass also reached 
the lowest level during this period (7). Water stress during 
summer might have led to the decrease in biomass because 
summer rainfall during these dry years was about 53% less 
than the average of other years (Fig. 2). After these dry 
years, eCO2 favored C4 but not C3 grasses. Besides the 
asymmetric changes in net nitrogen mineralization rates 
between C4 and C3 soils as suggested by Reich et al. (1), we 
offer two other possible mechanisms for the “winner,” C4 
grasses. First, increased growing-season average temperature 
might favor C4 over C3 grasses under eCO2 conditions. 
The growing-season average temperature was significantly 
higher, by approximately 0.98°C, after the dry 
years than before them (Fig. 2; t = -3.6; P < 0.01). Using a two-way ANOVA 
with CO2 (ambient CO2 versus eCO2) and average growing-season 
temperature (before versus after the dry years) as 
fixed factors to determine the effects of eCO2 and temperature 
on the 3-yr moving averaged C4 biomass, we found that 
increased growing-season temperature might interact with 
eCO2 to affect log10-transformed C4 biomass (F = 4.4; P < 
0.05). As suggested by other studies and as shown in Fig. 1, 
the warm-season C4 grasses can grow better than C3 grasses 
under higher temperature conditions (4), and can enhance 
their sensitivity to eCO2 with increasing temperature when 
soil moisture content is not limiting (4-6). Second, C3 grasses, 
as cool-season species, lose their positive responses to eCO2 
as the ambient temperature increases, as shown in Fig. 1. 

Understanding the directions and magnitudes of responses 
of C4 and C3 grasses to eCO2 is crucial in modeling 
carbon-climate feedbacks. It is difficult to predict the 
changes in plant biomass dynamics in an intricate ecosystem 
based only on the photosynthetic pathways. Several 
studies have shown that the relative effects of eCO2 on the 
biomass of C4 and C3 grasses are highly influenced by soil 
water availability and temperature (2-6). We argue that the 
interpretation of the biomass data would be more meaningful 
if it appropriately considered the effects of inter-annual 
variations in the ambient rainfall and temperature. Our 
analysis and interpretation of the biomass data provide 
insights different from those of Reich et al. (1), but we fully 
support their call for long-term experiments. 
[Fig. 1 panels: CO2 effect on total biomass (g m-2) of C4 and C3 grasses versus growing-season rainfall (mm) and average growing-season temperature (°C).] 


Fig. 1. Relationships between the CO2 effect on total C4-C3 biomass and growing-season climate. 
The biomass data are from the measurements of Reich et al. (1). CO2 effect size = biomass under 
eCO2 condition − biomass under ambient CO2 condition. The Cedar Creek weather data are from Fort 
Snelling near the Saint Paul International Airport (www.wunderground.com/history/airport/KMSP). 
The biomass and weather data are shown as 3-yr moving averages centered over the middle of each 
3-yr group. The relationships between CO2 effect size and climatic factors were analyzed using 
generalized linear models (CO2 effect size ~ temperature + rainfall; family = Gaussian; link = identity). 
The partial R^2 of each climatic factor was obtained using the rsq.partial function from the rsq package 
in R version 3.2.2. 
Fig. 2. Inter-annual trajectories of C4 total biomass at the ambient (red) and elevated CO2 (blue) 
levels and summer rainfall (orange). The biomass data from the measurements of Reich et al. (1) are 
shown as 3-yr moving averages centered over the middle of each 3-yr group. The Cedar Creek weather 
data are from Fort Snelling near the Saint Paul International Airport. 
AAAS NEWS & NOTES 


AAAS analyst assists a human rights organization in gathering data during an exhumation. 


Emerging scientific technologies help defend human rights 


Scientists leverage advancing tools to gather evidence and expand the capacity of human rights groups 


By Anne Q. Hoy 


Against a backdrop of summer heat and a constant roar of distant 
howler monkeys, a scientific analyst piloted a drone to collect data 
from a hillside in northern Guatemala. At his side, anthropologists 
affiliated with a regional human rights group painstakingly cleared 
soil and roots from human remains in a mass grave. 
“Remains contorted, overlapping, interlaced, a cruel, tragic mashup of 
Hieronymus Bosch and H.R. Giger,” noted Jonathan Drake, senior program 
associate of the American Association for the Advancement of Science's 
[...], summoning images from [...]- and [...]-century artists to describe 
the nightmarish remnants of an [...] during Guatemala’s lengthy civil war. 

Multiple skeletons were exhumed. Clothing with burnt edges stuck to the 
bones of some. A blindfold encircled a skull. Leg bones bore evidence of 
a child. 

Those were among the observations Drake shared after maneuvering a 
commercial-grade drone at specific angles for optimal data collection 
and documentation. 

Drake and the representatives of the Guatemalan Forensic Anthropology 
Foundation (FAFG) each used drones to collect several hundred overlapping 
photographs documenting the location of the mass grave and to record each 
step of the exhumation process. 

AAAS and FAFG have worked together on six projects since AAAS first 
offered its scientific expertise and technical and analytical skills to 
assist in the search and exhumation of mass graves in Guatemala. The 
collaborations went from “proof of concept, to training on data 
processing, to capacity building, to implementation in a real-world 
context,” said Drake. 

AAAS's use of drones for geospatial documentation in Guatemala grew out 
of an earlier alliance with EQUITAS, an independent, nonprofit 
organization of scientific investigators of human rights violations in 
Bogota, Colombia. 

The organization sought help locating remains of victims of forced 
disappearances during Colombia's [...] asked for help with a suspected 
mass grave in a cloud-covered, mountainous region likely [...] While 
remote sensing collects data without requiring a visit and some types can 
penetrate cloud cover, high-resolution optical imagery offers better 
chances of capturing data from locations blanketed by clouds. 

It was in Colombia where AAAS first used and tested 
a drone to collect data from the suspected mass grave 
site. The site turned out to have no graves, but the data gathered al- 
lowed AAAS to produce a comprehensive map of the canopied jungle, 
providing EQUITAS with its first detailed survey of the region and a tool 
for planning future digs, Drake said. 

The AAAS geospatial project has evolved since its establishment 
in 2005 by putting technologies to work as they emerge. Drake and 
earlier project participants have incorporated emerging technological 
advances into scientific collaborations with human rights practitioners 
around the globe. 

Lars Bromley, a former AAAS staff member and participant in 
the geospatial project’s collaborations with human rights organiza- 
tions, took on one of AAAS's initial initiatives. In an alliance with 
Amnesty International, Bromley used available technology and 
applied his background in geospatial analysis to scrutinize satellite 
imagery of four communities in Zimbabwe. 

The government asserted that homes in the communities were 
illegal and launched a demolition campaign that organizations con- 
sidered to be human rights violations. AAAS analyzed the destruc- 
tion and removal of more than 5000 homes captured by data that 
were provided to human rights lawyers in Zimbabwe. 

Leveraging drones to gather evidence that can later be used in 
courts is just one of the emerging technologies AAAS is now testing 
and using in its alliances with human rights organizations, particu- 
larly those that lack significant resources. 

Scientific advances in machine learning and artificial intelli- 
gence are being studied and, in some cases, tested for their ability 
to capture and analyze copious amounts of data. More recently, 
blockchain, microdrones, and nanodrones are being recognized as 
technologies that could assist in human rights investigations. 

Blockchain technology, Drake noted, could be useful in establish- 
ing a chain of custody for scientific evidence, as well as providing ver- 
ifiable provenance of digital data necessary in human rights cases. 

“They all fit together. Artificial intelligence fits with satellite im- 
agery in terms of classifying images, and it is also critical in linking 
hundreds and hundreds of images collected by a drone into a 3-D 
model,” Drake explained. “You can’t do that sort of thing without 
machine learning.” 

Insights gleaned from the drone test in Colombia were put to 
work in Guatemala. The drone's effectiveness in data gathering 
opened the door to AAAS's eventual ability to produce 3-D models 
from high-resolution images, assign global coordinates, map large 
areas, and conduct such analyses relatively quickly. 

During AAAS's first collaboration with FAFG in Guatemala, Drake 
used a drone to collect extensive photographic data and later 
transformed the photographs into a 3-D representation using a 
computer-assisted process known as photogrammetry. The pro- 
cess permits photographs of a single skeleton to be merged with 
images of subsequent exhumations, producing a dynamic model 
of a mass grave’s human remains and contents. The model offers 
an integrated 3-D image of all or any part of the scene. 

Data collection by drones can show the precise orientation of 
one skeleton in relation to another—such as one facing another—a 
heartbreaking and rarely noticed view at a site during an exhuma- 
tion since skeletons are most often removed separately. 

Removal is a laborious process that can consume a day for a 
single extraction. Precise documentation of each exhumation is 
required. The scientific methods employed to collect data also 
must be described. 

“All of these technologies are sort of converging in a way that is 
really very effective and has the potential of being really effective in 
promoting human rights around the world,” noted Drake. 

The AAAS Scientific Responsibility, Human Rights and Law Pro- 
gram released a report in July examining the lessons learned 
by the organization in providing “geospatial analysis in a human 
rights context.” 

The report includes reviews of dozens of legal cases in which 
geospatial technology provided evidence used in international criminal 
prosecutions brought before the International Criminal Court; 
conflict-specific courts in Yugoslavia, Sierra Leone, and Cambodia; 
and human rights courts in Europe and Central and South America. 
Among its recommendations, the report calls on the judicial 
branch to appoint independent scientific advisers for cases that 
involve highly technical or specialized research; geospatial ana- 
lysts to document where and how their findings were collected and 
explain the scientific methods used; and human rights groups to use 
independent, experienced analysts. It urges government agencies 
to safeguard metadata and preserve the chain of custody of data, 
which are vital to the legal community. It also advises private-sector 
satellite data providers to protect their images from manipulation. 
Bromley, who now serves as a principal analyst and research 
adviser for a United Nations institute and a satellite operations pro- 
gram, expressed surprise about how often fundamental education 
about geospatial technologies continues to be required by human 
rights organizations. Over time, Bromley said, he has “come to 
accept that capacity building is a slow and never-ending process.” 
AAAS integrates extensive training in the collection of evidence 
from emerging technologies in its work with human rights groups. 
“Now, 10 years later, these technologies are firmly embedded in 
the human rights landscape and are in relatively common use,” said 
Bromley, referring to the geospatial technologies that were actively 
used when the program began and the pathway AAAS continues to 
pursue with emerging technologies. 
“We really did manage to take these from an infant technology to 
something useful and valued,” he added. 


AAAS annual election: 
Preliminary announcement 


The 2018 AAAS election of general and section officers is 
scheduled to begin in October. All members will receive a ballot 
for election of the president-elect, members of the Board of 
Directors, and members of the Committee on Nominations. 
Additionally, members registered in sections (up to three) will 
receive ballots for the specified section elections. Biographical 
information for the candidates will be provided along with bal- 
lots. The general election slate is listed below. The list of section 
candidates can be viewed at www.aaas.org/annual-election. 


Notice to our international members: 
In an effort to conserve resources, AAAS will be sending electronic 
election ballots to our non-U.S.-based members. In order to 
ensure you receive your ballot, please make sure your email is 
up-to-date with AAAS by logging on to www.aaas.org. 1) Click on 
“Member Login” (if you have not yet created an account, you will 
be prompted to do so); 2) After you log in, click on the red “My 
Profile” button in the upper right-hand corner of the page; 3) Click 
on “Edit My Contact Information” in the left-hand side bar; 4) 
Update your email and click on the “Save” button. 

If you would like to request a special paper ballot, please 
send an email with your name and address with your request 
to elections@aaas.org. 

Published by AAAS 


General Election Slate 
President-Elect: 

Jared D. Cohon, Carnegie Mellon 
University; Claire M. Fraser, Univer- 
sity of Maryland School of Medicine 


Board of Directors: 

Ann Bostrom, University of 
Washington; Maria Klawe, Harvey 
Mudd College; Peter R. MacLeish, 
Morehouse School of Medicine; 
Griffin P. Rodgers, National In- 
stitute of Diabetes and Digestive 
and Kidney Diseases 
Federal research funding aims to ease societal challenges 


A key goal is to spread access to scientific excellence and federal research funding nationwide 


By Anne Q. Hoy 


Half of all federal research funding in the United States goes to re- 
cipients based in six states and the District of Columbia, leaving the 
other 50% of funding split among those in the remaining 44 states, 
the National Science Foundation’s annual Survey of Federal Funds 
for Research and Development shows. 

The concentration of geographic funding to primary recipients in 
California, Maryland, Massachusetts, New York, Texas, Virginia, and the 
District of Columbia was cited by Kei Koizumi, a senior science policy 
adviser for the American Association for the Advancement of Science, 
during a panel presentation exploring the evolution of federal research 
funding in the United States, France, Japan, and other Organization for 
Economic Co-operation and Development member countries. 

“We have abundant research that shows the majority of U.S. stu- 
dents go to school in their home states,” said Koizumi during the pre- 
sentation on 13 July. “So, if research funding is not happening in their 
state, then they are missing out on an opportunity to participate in our 
science and technology enterprise.” 
Concentrated research funding distri- 
bution levels, he added, deny states 
economic development, growth, and 
jobs that the system develops. 

The presentation was held at the 
biennial EuroScience Open Forum 
2018 in Toulouse, France, a gathering 
of more than 3000 scientists, innova- 
tors, policy-makers, and business 
representatives 9-14 July to discuss 
scientific research, innovation, and 
science policy issues. AAAS CEO 
Rush Holt moderated a session on 
science diplomacy and AAAS staff 
highlighted activities of AAAS’s Cam- 
bridge, U.K., office and the online, 
global news service EurekAlert! 

In recent years, geographic funding 
concentration levels in the United 
States have remained fairly consistent. 
Yet the country’s leading federal re- 
search funding institutions have been testing experimental programs 
to spread federal research funding more equally across the country 
to address economic and social inequities. 

Science and engineering research funding programs are searching 
for ways to provide university students in every state an opportunity 
to search for knowledge, extend scientific excellence, and, in so doing, 
ensure that the system tackles larger societal issues, said Koizumi. 

“It is important, both politically and socially, to address inequalities 
on multiple dimensions, and science funding is not exempt from that 
imperative,” Koizumi said. “We have seen that competitive research 
funding mechanisms, left to their own devices, can result in inequali- 
ties. They can perpetuate other inequalities that exist in society.” 

Adjusting the funding system to support multiple societal objec- 
tives as it also seeks to produce scientific excellence is not easy, 
noted Koizumi in his session on “Supporting long-term research in 
a world of sudden change: The evolution of research and funding in 
current financial and political contexts.” 

“It is, of course, a common insight now to see that the U.S. scientific 
workforce does not look like the U.S. population, and so diversity 
and inclusion are important considerations for how we support the 
U.S. scientific enterprise,” added Koizumi. 

Kei Koizumi (right) fields questions about scientific research funding 
mechanisms during a break at biennial ESOF in Toulouse, France. 

To overcome impediments, the National Science Foundation, the 
Departments of Energy and Agriculture, and NASA have established 
programs under the Established Program to Stimulate Competi- 
tive Research, or EPSCoR, which was established in 1978 to enable 
universities across the country to compete for federal research fund- 
ing. The National Institutes of Health began a similar program 25 
years ago known as Institutional Development Awards, or IDeA. Both 
programs continue to grow. 

For two decades, AAAS has supported more than 30 states by 
providing 151 assessments of more than $1.2 billion in research 
projects funded by NSF’s EPSCoR and NIH’s biomedical research 
IDeA programs. AAAS’s Research Competitiveness Program, or 
RCP, conducts the work and provides peer-to-peer insights from 
independent U.S. experts and, more recently, quantitative evaluations 
of projects. In addition, RCP is now working on an NSF study to 
devise a framework for measuring academic research excellence 
and competitiveness for EPSCoR and 
potentially other NSF programs. 

“Our programs strengthening 
STEM ecosystems within the U.S. 
have parallels to national STEM 
initiatives in other countries,” said 
Charles Dunlap, RCP’s program di- 
rector. “While we continue to support 
institutions in the U.S., institutions 
abroad are increasingly contacting 
AAAS for support as well.” 

A range of other collaboration 
models also have emerged. One is 
organized around national objectives 
such as improving health care, ad- 
dressing climate change, or expand- 
ing manufacturing opportunities. 
Other competitive funding initiatives 
promote cross-sector collaborations 
that align private businesses with 
research universities and federal 
research laboratories and interna- 
tional collaborations that match scientific research groups with global 
partners to pursue shared scientific goals. 

Competitive research funding endeavors also focus on “high- 
risk, high-reward research, or potentially transformative research,” 
said Koizumi, in a drive to offset the tendency among experienced 
researchers in a highly competitive funding arena to pitch less risky 
and shorter-term proposals. 

Since World War II, the federal research funding system has helped 
the United States become the world’s leader in science and engi- 
neering innovation. With time, though, system flaws have emerged, 
including stress on success rates due to the growing number of re- 
search proposals that fall short of funding, a development that raises 
the cost of the scientific review process. The system also has failed 
to expand the ranks of underrepresented minorities and women in 
the scientific enterprise. 

In addressing the state of today’s competitive research funding 
system, Koizumi said, “With careful attention we can use competitive 
research funding to attempt to address the challenges of inequalities 
both inside the scientific enterprise but also with our society at large.” 
Every revolution in technology is followed by an 
explosion of new scientific knowledge. Biology 
is no exception. Van Leeuwenhoek’s micro- 
scope jump-started microbiology, polymerase 
chain reaction revolutionized molecular biol- 
ogy (see Editorial in this issue), and single-cell 
imaging and sequencing approaches remark- 
ably advanced immunology, cancer research, 
developmental biology, and beyond. 

Recent years have witnessed disruptive innova- 
tions in biotechnology. Researchers have never been 
equipped with more powerful tools to probe biology. 
Breakthroughs in electron microscopy allow biomo- 
lecular complexes to be visualized with high resolu- 
tion, offering insight into the workings of these mo- 
lecular machines. Ingenious methods to break the 
diffraction limit in microscopy enable single molecules 
to be observed and tracked in single cells. Not 
only do we see cells in ever-greater detail, but with 
CRISPR-mediated gene editing techniques, biologists 
can precisely and easily manipulate cellular genomes 
of diverse organisms. As we gain understanding of 
biological networks, tools including those based on 
CRISPR give us the ability to record biological events, 
to detect and treat disease, and to engineer plants 
with new traits and greater productivity. 

When biologists encounter problems that cannot 
be solved by currently available technologies, shared 
creativity between researchers drives the develop- 
ment of better and smarter tools. New technologies, 
in turn, push the frontier of biology. This synergy has 
been moving our society and humanity forward, and 
the advent of artificial intelligence is likely to speed 
up this cycle of discovery. 
REVIEW 


CRISPR-Cas guides the future 
of genetic engineering 


Gavin J. Knott and Jennifer A. Doudna 


The diversity, modularity, and efficacy of CRISPR-Cas systems are driving a biotechnological 
revolution. RNA-guided Cas enzymes have been adopted as tools to manipulate the genomes 
of cultured cells, animals, and plants, accelerating the pace of fundamental research and 
enabling clinical and agricultural breakthroughs. We describe the basic mechanisms that set 
the CRISPR-Cas toolkit apart from other programmable gene-editing technologies, highlighting 
the diverse and naturally evolved systems now functionalized as biotechnologies. We discuss 
the rapidly evolving landscape of CRISPR-Cas applications, from gene editing to transcriptional 
regulation, imaging, and diagnostics. Continuing functional dissection and an expanding 
landscape of applications position CRISPR-Cas tools at the cutting edge of nucleic acid 
manipulation that is rewriting biology. 


Researchers have long pursued a means of 
efficiently manipulating DNA and RNA to 
tailor genes and their regulation. Genetic 
perturbation enables scientists to probe 
gene function or correct mutations but is 
often intractable due to a technical challenge: 
site-specific nucleic acid targeting. Targeted gene 
editing has been achieved by induced double- 
stranded DNA (dsDNA) breaks in eukaryotic chro- 
mosomes (1), but with challenging technologies 
based on engineering direct protein-DNA recog- 
nition. The history recounting the discovery, de- 
velopment, and application of such engineered 
nucleic acid binding proteins—including zinc 
fingers, TALENs, and meganucleases—is rich 
in noteworthy scientific feats (2). Over the past 
6 years, however, transformative discoveries have 
shaped the CRISPR (clustered regularly interspaced 
short palindromic repeats)-Cas (CRISPR-associated) 
toolbox for genetic manipulation on 
the basis of simpler RNA-guided DNA recogni- 
tion. This toolbox now provides important scien- 
tific opportunities for curing genetic diseases 
and engineering desirable genetic traits, as well 
as new approaches to live-cell imaging, high- 
throughput functional genomic screens, and 
point-of-care diagnostics. In this Review, we 
summarize the basic mechanisms of RNA-guided 
single-component CRISPR-Cas systems and their 
general applications. The basis for the CRISPR 
revolution goes beyond inherent programma- 
bility, lending itself to the naturally evolved 
diversity of systems that extend CRISPR-based 
technology beyond precision gene editing. To 
capture the broadened landscape of Cas appli- 
cations and their impact as a force for revolu- 
tion in molecular biology, where appropriate, 
we refer readers to recent reviews for a more 
detailed discussion. 


Diverse RNA-programmable 
CRISPR-Cas enzymes 


CRISPR-Cas systems provide microbes with RNA- 
guided adaptive immunity to foreign genetic 
elements by directing nucleases to bind and 
cut specific nucleic acid sequences (3-5) (Fig. 1). 
Through a process termed adaptation, microbes 
capture snippets of foreign genetic elements and 
incorporate them into their genomic CRISPR 
array. Transcription of CRISPR arrays creates 
CRISPR RNAs (crRNAs) that bind to Cas nucleases 
and provide specificity by base-pairing with 
target nucleic acids (4, 5). Among the diverse 
naturally evolved CRISPR-Cas systems, those 
designated class 2 constitute a single large RNA- 
guided Cas nuclease that mediates target inter- 
ference or cleavage [reviewed in (6)]. 

The class 2 type II DNA-targeting endonuclease 
Cas9 (the first Cas effector to be harnessed 
for genome engineering) has several properties 
that ensure precise and efficient editing (Box 1A 
and Fig. 2A). Cas9 assembles with only the in- 
tended guide RNA through specific recognition 
of the crRNA and its interaction with a trans- 
activating crRNA (tracrRNA). In addition, the 
dual crRNA-tracrRNA can be fused into a chi- 
meric single-guide RNA (sgRNA), thereby creat- 
ing a two-component system composed of Cas9 
and its sgRNA (7). Finally, stable binding to tar- 
get DNA adjacent to a specific motif [protospacer 
adjacent motif (PAM) (8, 9)] with the correct 
nucleotide sequence acts as a switch, triggering 
Cas9 to introduce a dsDNA break (7). Scientists 
worldwide have deployed Cas9 because of this 
switchable nuclease activity and the ease of re- 
directing the enzyme by altering the sgRNA- 
targeting region (or spacer sequence) (10-12). 
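
The targeting logic described above (a programmable spacer plus a required PAM next to the target) can be illustrated with a small search for candidate SpCas9 sites. The sketch assumes the canonical SpCas9 PAM, 5'-NGG-3', immediately 3' of a 20-nucleotide protospacer; neither value is stated in the text above, and the function names and example sequence are hypothetical. It ignores everything a real guide-design tool would score, such as off-target matches.

```python
import re

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA string (A/C/G/T only are translated)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_spcas9_sites(dna, spacer_len=20):
    """Return (strand, start, protospacer, pam) for every NGG-adjacent site.

    `start` is the 0-based index of the protospacer on the strand that was
    scanned; for "-" it refers to the reverse-complemented sequence.
    """
    # A lookahead keeps matches zero-width so overlapping sites are all found.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % spacer_len)
    sites = []
    for strand, seq in (("+", dna.upper()), ("-", reverse_complement(dna))):
        for m in pattern.finditer(seq):
            sites.append((strand, m.start(), m.group(1), m.group(2)))
    return sites


# Hypothetical example sequence, chosen only to exercise the search.
example = "ACGTACGTACGTACGTACGTAGGCTTAA"
for site in find_spcas9_sites(example):
    print(site)
```

Redirecting the enzyme then amounts to placing a reported protospacer sequence into the targeting region of the sgRNA; the PAM itself belongs to the target DNA, not to the guide.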

Fig. 1. CRISPR-Cas adaptive immunity. (A) Foreign genetic elements 
are acquired by Cas1-Cas2 and integrated into the CRISPR array in 
a process broadly termed adaptation. (B) The CRISPR array and 
associated Cas proteins are expressed. The CRISPR array is processed 
and Cas effector nucleases associate with a crRNA to form a 
surveillance complex. (C) The Cas effector nucleases target foreign 
genetic elements complementary to their crRNA, leading to target 
interference and immunity. 

Fig. 2. Schematic of class 2 CRISPR-Cas systems. (A) Class 2 type II 
CRISPR-Cas9 shown schematically with an sgRNA (blue) encoding a spacer 
(red) bound to a target dsDNA (black) proximal to a PAM (teal). Correct 
base-pairing activates the HNH and RuvC nuclease domains, cleaving both 
strands (scissors). (B) Class 2 type V CRISPR-Cas12a shown schematically 
with a crRNA (blue) encoding a spacer (red) bound to a complementary 

Although Streptococcus pyogenes Cas9 (SpCas9) 
remains the most commonly used Cas effector, it 
is not alone in the evolutionary arms race against 
mobile genetic elements. Bacteria and archaea 
have evolved numerous functionally distinct 
CRISPR-Cas systems that maintain the programmable 
characteristics key to the success of SpCas9. 
Scientists have tapped the evolutionary diversity 
of type II systems, incorporating divergent homo- 
logs and engineered variants of SpCas9 into an 
arsenal of genome editors. At the tail end of 2015, 
class 2 systems expanded to include a number of 
candidate systems, which were later designated 
type V CRISPR-Cas12a (formerly Cpf1) and type 
VI CRISPR-Cas13a (formerly C2c2) (Box 1, B and 
C, and Fig. 2, B and C). Today, SpCas9 shares the 
spotlight with a diversity of Cas9 homologs, DNA- 
targeting Cas12, and RNA-targeting Cas13, all of 
which are programmable RNA-guided nucleases 
[reviewed in (13)]. It is this inherent program- 
mability present in a diversity of naturally evolved 
systems that extends CRISPR-Cas applicability 
beyond precision genome editing. 


Applications of Cas-mediated 
genome editing 


Although the scope of Cas application 
has broadened, precision genome engi- 
neering remains at the forefront of the 
CRISPR revolution. Cas9 and Cas12a are 
RNA-guided nucleases that can induce 
genome editing by triggering dsDNA 
break repair at a specific site (Fig. 3). 
Editing occurs after cellular DNA repair 
pathways resolve the break by nonhomol- 
ogous end joining (NHEJ), which can 
introduce small insertions or deletions, 
or by homology-directed repair (HDR) 
with a donor sequence at the site of the 
dsDNA break [reviewed in (14)]. 
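
As a rough illustration of what targeting "a specific site" involves computationally, the sketch below scans a DNA string for candidate SpCas9 target sites. It assumes the commonly cited SpCas9 requirements of a 20-nt protospacer lying immediately 5' of an NGG PAM, details that this text does not itself state. The function names (`reverse_complement`, `find_spcas9_sites`) are hypothetical, and the sketch ignores off-target scoring, chromatin context, and the different PAMs of other Cas effectors.

```python
import re

# Assumption (not from the text): SpCas9 uses a 20-nt protospacer followed by an NGG PAM.
_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string; unknown bases become N."""
    return "".join(_COMPLEMENT.get(base, "N") for base in reversed(seq.upper()))


def find_spcas9_sites(seq, spacer_len=20):
    """List candidate sites as (strand, start, protospacer, pam) tuples.

    `start` is the 0-based offset on the strand that was scanned, so
    minus-strand offsets refer to the reverse-complemented sequence.
    """
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % spacer_len)
    sites = []
    for strand, strand_seq in (("+", seq.upper()), ("-", reverse_complement(seq))):
        # The zero-width lookahead lets overlapping candidate sites all be reported.
        for match in pattern.finditer(strand_seq):
            sites.append((strand, match.start(), match.group(1), match.group(2)))
    return sites


if __name__ == "__main__":
    toy_locus = "A" * 20 + "TGG"  # made-up sequence for illustration only
    for site in find_spcas9_sites(toy_locus):
        print(site)
```

For the toy input above, the function reports a single plus-strand candidate. Real guide design would additionally rank candidates for specificity, a concern taken up in the section on specificity and delivery.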

As tools for precision genome engi- 
neering, Cas9 and Cas12a work in a wide 
range of cell types and organisms. Cas- 
mediated gene editing has prompted 
genome-wide screens to probe basic 
biological function, in addition to iden- 
tification and validation of potential drug 
targets in complex heritable diseases [re- 
viewed in (15)]. Agricultural applications 
of Cas-nucleases [reviewed in (16)] have 
produced modified crops that now have a stream- 
lined path to the market (17). In the clinic, Cas- 
nucleases allow diseases with a known genetic 
basis to be treated and, in an era of high-throughput 
DNA sequencing, allow treatment to be personalized 
to a patient’s disease etiology. Examples include 
gene editing to 
correct mutations or induce skipping of defective 
exons in Duchenne muscular dystrophy (DMD), 
strategies that are already showing efficacy in 
animal models (18, 19). Cas9 has also been used 
to inactivate defective genes that underlie neuro- 
logical diseases, including amyotrophic lateral 
sclerosis (20) and Huntington’s disease (21). 
Scientists have used Cas9 to eliminate an entire 
chromosome in aneuploid human pluripotent 
stem cells (22), to inactivate an endogenous retro- 
virus in pigs (23), and to engineer T cells as a 
prelude to developing advanced immunotherapies 
to target cancers (24). Furthermore, Cas9 has 


Box 1. Crash course in class 2 CRISPR-Cas systems. 


(A) Class 2 type II CRISPR-Cas systems target dsDNA 
using the effector nuclease Cas9, a crRNA, and tracrRNA 
[crRNA-tracrRNA fusion creates the sgRNA (7)]. Cas9 binds 
to a DNA sequence complementary to the sgRNA spacer 
adjacent to a PAM. Cas9 senses correct base-pairing, thus 
activating its RuvC and HNH nucleases to cleave the 
nontarget and target DNA strands. (B) Class 2 type V 
CRISPR-Cas systems, specifically subtype Cas12a, target 
ssDNA and dsDNA using the effector nuclease Cas12a 
(formerly Cpf1) guided by a single crRNA (75). Cas12a binds 
to a DNA sequence complementary to the crRNA spacer, 
adjacent to a PAM for dsDNA targets. Cas12a senses correct 
base-pairing to activate its RuvC nuclease for general 
ssDNase activity, cleaving the nontarget and target DNA 
strands and trans-ssDNA substrates. (C) Class 2 type VI 
CRISPR-Cas systems, specifically subtype Cas13a, target 
ssRNA using the effector nuclease Cas13a (formerly C2c2) 
guided by a single crRNA (46, 47). Cas13a binds to an ssRNA 
sequence complementary to the crRNA spacer. Cas13a 
senses correct base-pairing to activate the HEPN nuclease for 
general ssRNase activity. See also Fig. 2. 
dsDNA target (black) proximal to a PAM (teal). Correct base-pairing 
activates the RuvC nuclease, cleaving both strands (scissors) with 
multiple-turnover general ssDNase activity (arrow). (C) Class 2 type VI 
CRISPR-Cas13a shown schematically with a crRNA (blue) encoding a spacer 
(red) bound to a complementary RNA target (black). Correct base-pairing 
activates HEPN nuclease general ssRNase activity (arrow). See also Box 1. 


allowed targeting of the genetic basis for sickle 
cell disease (25) such that there are now esta- 
blished protocols for the correction of genetic 
defects in this cell type (26). Beyond such somatic 
cell editing, the potential to correct genetic muta- 
tions in human embryos is on the horizon, raising 
societal and ethical questions about creating herit- 
able changes in the human germline (27). 

However, it is important to note that precision 
editing remains challenging, particularly given 
competing repair outcomes (such as NHEJ) re- 
straining the efficiency of more desirable HDR 
repair outcomes (28). An alternative approach uti- 
lizes Cas effectors fused to base editors, limiting 
unintended edits and eliminating the requirement 
for repair templates. Distinct from DNA cleavage 
and repair, nickase Cas9 (nCas9)-mediated base 
editing carries a single-base editor to a target locus 
facilitating base conversion without dsDNA cleav- 
age (Fig. 3) [reviewed in (29)]. Recently, 
the toolbox of base editors expanded to 
include a laboratory-evolved deaminase 
enabling nCas9-mediated single-base 
editing to catalyze A-T to G-C transitions 
(30). The existing Cas9-mediated base 
editors now enable researchers to create 
any of the four possible transition muta- 
tions at a specific genomic locus (30-33). 
Although single-base editors provide the 
potential to correct disease-causing muta- 
tions without inducing a dsDNA break, 
the technology requires further devel- 
opment to limit off-target editing. Looking 
forward, the next generation of Cas- 
mediated genome editors will likely in- 
clude base editors, ideally with base-editing 
activity conformationally coupled to 
Cas9 target DNA binding. 


Transcriptional regulation 
with dCas9 


Cas9 has proven to be a modular plat- 
form with functionally distinct DNA 
binding and nuclease activities. Decou- 
pling DNA binding from the enzymatic 
activity of Cas9 by mutating the nuclease 
domains creates catalytically deficient 
Cas9 (dCas9), a functional scaffold for 
recruiting protein or RNA components to 
a specific locus to perturb transcription 
without permanently altering DNA 
[reviewed in (34, 35)] (Fig. 3). The use of 
dCas9 has revolutionized functional 
genetic screening by enabling specific, 
rapid, and multiplexed genetic knockdowns 
in a range of cell types, including 
immune cells and neurons (36, 37). These 
advances with dCas9 highlight the 
practicality of genomic perturbation without 
the risk of DNA damage, an attribute that 
has motivated studies in model systems to 
drive therapeutic development. For example, 
dCas9 fused to TET1, a demethylase, 
targeted to the dysregulated FMR1 locus 
reversed the phenotype of fragile X 
syndrome in neurons and mouse models (38). 
Gain-of-function studies have successfully 
implemented a modified dCas9 target gene 
activation system to treat type 1 diabetes, 
acute kidney injury, and murine muscular 
dystrophy (39). The ability to conduct sup- 
pressor screens and synthetic lethal screens 
in virtually any cell type offers the poten- 
tial to discover gene functions, effector 
interactions, and pathways at a pace never before 
possible. However, challenges remain: dCas9- 
effector fusions have complex off-target effects due 
to the fused catalytic domains targeting neighbor- 
ing or even unrelated loci. Additionally, unpre- 
dictable locus-specific effects on chromatin, and 
in turn transcription, can confound analysis and 
obscure causality (40). Future work should ap- 
propriately control for unpredictable locus-specific 
effects with systematic validation and should 
aim to further improve specificity. 


Posttranscriptional engineering with 
RNA-targeting Cas 


As an alternative to permanent genetic alteration, 
Cas effectors can be applied to transiently perturb 
the transcriptome through direct RNA targeting 
(Fig. 3). Engineering SpCas9 to create a program- 
mable RNA-targeting system with the use of a 
PAM-presenting oligonucleotide (PAMmer) (41) 
ushered in applications for RNA-targeting with 
Cas9 (RCas9). Targeting RCas9 to RNA can elim- 
inate pathogenic RNA foci, rescue mRNA splicing 
defects, or attenuate polyQ-containing protein pro- 
duction from RNAs with trinucleotide CAG repeats 
(42). To date, the arsenal of RNA-targeting Cas9s 
has expanded to include related homologs with 
programmable RNA-targeting activity that is 
PAMmer independent (43-45). Given its success, 
Cas9 lends itself to further development for post- 
transcriptional engineering, such as fusions to 
single-base RNA modifiers to achieve site-specific 
RNA modifications. 

Cas13 has also emerged as a highly versatile 
tool for RNA targeting. Reconstituting Cas13a 
in Escherichia coli (46) and in vitro (46, 47) 
established type VI systems as an RNA-guided 
general ribonuclease (RNase). Cas13a has been 
employed in vivo as a tool for specific knock- 
down in mammalian (48) and plant cells (49). 
Evolutionarily and functionally related to Cas13a, 
Cas13b enzymes have programmable RNase ac- 
tivity that has been functionalized for RNA inter- 
ference and RNA editing in mammalian cells 
(50, 51) (Fig. 3). More recently, CRISPR-Cas13d 
was identified (52, 53) and reconstituted for mod- 
ulating splicing in vivo (52). RNA-targeting sys- 
tems such as Cas9 and Cas13 support targeted 
RNA-guided research in addition to clinical ap- 
plications akin to antisense oligonucleotide ther- 
apies for the treatment of acute non-Mendelian 
pathologies, avoiding the risks associated with 
permanent genetic modification. However, fu- 
ture studies are needed to determine how RNA- 
targeting Cas-effectors interface with a structured 
or protein-occluded RNA landscape and how 
trans-RNA cleavage by Cas13 is attenuated in vivo. 


Programmable nucleic acid imaging 


Correct spatiotemporal localization is critical 
to the function of specific genomic loci, mRNAs, 
and noncoding RNAs, with dysregulated molec- 
ular localization strongly implicated in disease. 
Current technologies for live-cell imaging of ge- 
nomic loci or nascent RNA are limited by the 
need for protein engineering or the introduction 
of targetable sequences into a transcript of in- 
terest. Leveraging dCas9, researchers have imaged 
repetitive genomic loci in live cells using dCas9 
fused to fluorescent reporters [reviewed in (54)] 
(Fig. 3). Exploiting the stringency of dCas9 PAM 
recognition, a method was developed that allows 
high-resolution single-nucleotide polymorphism 
CRISPR live-cell imaging of DNA loci (55). How- 
ever, widespread use of dCas9 to study localiza- 
tion of specific genomic loci has been limited by 
Fig. 3. CRISPR-Cas systems allow genetic manipulation across the central dogma. From left to right, 
Cas9 and Cas12a are used for inducing dsDNA breaks for genome editing. nCas9 can be fused to base editors to 
modify nucleotides in dsDNA for genome editing without introducing a dsDNA break. dCas9 can be fused to 
transcriptional activators, repressors, or epigenetic modifiers to regulate transcription. Cas9 and Cas13a can be 
used for targeted RNA interference. Cas13a fused to base editors can be used to modify nucleotides in RNA. 
dCas9 or dCas13a can be fused to green fluorescent protein (GFP) to visualize DNA or RNA. 


a low signal-to-noise ratio at nonrepetitive geno- 
mic sequences. One strategy to overcome inade- 
quate signal-to-noise ratio involves appending 
multiple bacteriophage MS2 operator RNA hair- 
pins (MS2 motifs) to the sgRNA (56). Tandem 
MS2 motifs act as high-affinity binding sites 
recruiting multiple MS2 motif binding proteins 
fused to a fluorescent reporter, effectively ampli- 
fying the signal to allow detection of a single 
dCas9-sgRNA bound to DNA in vivo (56). Lever- 
aging RNA-targeting RCas9 has allowed re- 
searchers to track RNA in live cells (57), thus 
making it possible to visualize clinically relevant 
repeat expansion-containing transcripts (42) 
(Fig. 3). With the growth of the RNA-guided 
RNA-targeting toolbox, RNA imaging tools now 
also include catalytically deficient Cas13a (dCas13a) 
(48). Though both RCas9 and dCas13a show 
promise when targeted to repetitive elements, 
further development is required to realize either 
platform as a reliable tool for low-abundance 
transcripts lacking in repetitive sequences. Fur- 
thermore, it is unclear whether localizing large 
exogenous ribonucleoproteins (RNPs) to tran- 
scripts might perturb cellular processes. 


Nucleic acid detection and diagnostics 


The RNA-guided nuclease activities of Cas13a 
and Cas12a have driven development of innova- 
tive tools for nucleic acid detection. For both 
Cas13 and Cas12a, which are functionally distinct 
from Cas9, a target nucleic acid (or activator 
RNA or DNA) activates general multiple-turnover 
nuclease activity through correct base-pairing to 
the guide RNA (Box 1, B and C, and Fig. 2, B and 
C). Leveraging this switchable nuclease activity, 
Cas13a was first functionalized as a tool for de- 
tecting target RNA transcripts of interest in a 
pool of RNA by detecting its RNase activity (47). 
Expanding on this work, SHERLOCK (Specific 
High-Sensitivity Enzymatic Reporter UnLOCKing) 
was developed as a platform incorporating pre- 
amplification of the input material to create a 
tractable paper-based assay with improved sen- 
sitivity (58). Biochemical dissection identified 
that divergent Cas13a homologs have discrete 
crRNA and substrate preferences enabling orthog- 
onal use to simultaneously detect two different 
transcripts (59). Similar dissection of Cas13b ho- 
mologs revealed substrate preferences that sup- 
ported expansion of the SHERLOCK platform, 
now SHERLOCKv2, to simultaneously detect 
dengue and Zika virus single-stranded RNA 
(ssRNA) (60) in a readily deployable format (61). 
Analogous to Cas13, Cas12a has evolved a func- 
tionally convergent switchable general nuclease 
that targets ssDNA (62). Exploiting this activity, 
DETECTR (DNA endonuclease-targeted CRISPR 
trans reporter) was developed as a CRISPR-based 
DNA detection and diagnostic platform (62). Cou- 
pled with isothermal pre-amplification, DETECTR 
was shown to rapidly and accurately detect clin- 
ically relevant types of human papillomavirus 
(62). SHERLOCKv2 integrated Cas12a-based DNA- 
targeting to detect either Pseudomonas aeruginosa 
or Staphylococcus aureus DNA targets in par- 
allel to detection of RNA targets by Cas13a and 
Cas13b (60). Akin to Cas13, tapping the functional 
diversity of Cas12 systems may yield functional 
variants that enable further development of DNA- 
based diagnostics. Looking ahead, the detection of 
a specific transcript using CRISPR-Cas is rapid 
and readily adaptable in the clinic, setting the 
stage for inexpensive point-of-care diagnostics. 


Specificity and delivery of CRISPR-Cas 


Unintended binding, modification, and cleavage 
of nucleic acids pose a challenge to all technolo- 
gies for genetic manipulation. Compared with the 
side effects caused by off-target interactions of 
small-molecule drugs or antibody therapeutics, 
off-target Cas nuclease activity is especially dele- 
terious because of the permanence of genome 
editing. Indeed, this further reinforces the neces- 
sity for nuclease specificity and targeted delivery. 
Researchers have made considerable advances 
in evolving and engineering Cas enzymes (63, 64) 
or sgRNAs (65) to improve nuclease specificity. In 
addition, robust methods for predicting targeting 
outcomes (66) and achieving spatiotemporal gene 
regulation (67) provide comprehensive strategies 
to reduce off-targets. Beyond engineering the 
Cas nuclease, researchers are also developing a 
deeper understanding of cellular DNA repair to 
improve the likelihood of achieving a desired 
editing outcome (68). 

Optimizing vehicles for efficient and specific 
delivery of the Cas payload remains a major ob- 
stacle, particularly in light of immune responses 
to sgRNA and Cas9 in humans (69, 70). Within the 
lab, researchers have a number of options (electro- 
poration, transfection, direct injection, and viral 
vectors) for delivering the DNA encoding the Cas 
payload, the ssRNA and mRNA encoding the Cas 
proteins, or preformed RNPs to cells ex vivo (71) 
or within immune-privileged environments (27). 
Unfortunately, many of these options cannot be 
broadly translated in clinical settings where the 
specific requirements for efficient in vivo delivery 
vary with disease etiology. Furthermore, the large 
size of Cas nucleases bound to their guide RNA pre- 
sents a challenge for packaging within viral-based 
vectors. One strategy to solve this problem is leverag- 
ing smaller related Cas homologs, or minimized 
systems that support packaging into viral vectors 
(72, 73). Alternatively, functionalized nanomaterials 
enable specific delivery to a cell type of interest. 
Recent studies have shown that directly injecting 
nanoparticles containing Cas9-sgRNA efficiently 
corrects the causative DMD mutation, leading to 
improved clinical phenotypes in mice (74). In all 
likelihood, the success of CRISPR-based therapeu- 
tics will depend on further development of suitable 
vehicles for delivering the Cas payload. 


Conclusions and future directions 


CRISPR-Cas based technologies provide an ac- 
cessible and adaptable means to alter, regulate, 
and visualize genomes, enabling biological re- 
search and biotechnological applications in a wide 
range of fields. CRISPR-Cas tools have vastly ac- 
celerated the pace of research, from understand- 
ing the genetics of previously unstudied organisms 
to discovering genes that contribute directly to 
disease. The field of Cas-based biotechnology is 
developing at a rapid pace, with multiple Cas9- 
based clinical trials in progress or beginning soon, 
the results of which will likely guide future use for 
somatic cell editing both ex vivo and in patients. 
Outside of the clinic, agricultural applications of 
CRISPR-Cas9 are already creating products for 
various markets, leading to recent rulings by the 
U.S. Department of Agriculture about their reg- 
ulation (17). This ever-expanding repertoire of ap- 
plications firmly positions the CRISPR-Cas toolkit 
at the cutting edge of genome editing and, more 
broadly, genetic engineering. 
REVIEW 


Emerging applications for DNA 
writers and molecular recorders 


Fahim Farzadfard and Timothy K. Lu 


Natural life is encoded by evolvable, DNA-based memory. Recent advances in dynamic 
genome-engineering technologies, which we collectively refer to as in vivo DNA writing, have 
opened new avenues for investigating and engineering biology. This Review surveys these 
technological advances, outlines their prospects and emerging applications, and discusses the 
features and current limitations of these technologies for building various genetic circuits for 
processing and recording information in living cells. 


Genomic DNA is an ideal medium for artificial 
biological information storage because of its 
ubiquitous presence, durability, and compatibility 
with biological functions, especially as the 
throughput of DNA sequencing has substantially 
increased along with 
drops in cost (1). With the advent of genome- 
editing technologies, we can now dynamically 
change genetic information and harness the vast 
capacity of genomic DNA for information pro- 
cessing and storage in living cells. These dynamic 
in vivo DNA-writing technologies have opened 
new avenues for investigating and engineering 
biology, ranging from building molecular re- 
corders and living biosensors for the longitudinal 
study of signaling dynamics in biological pro- 
cesses (2-6) to rationally designing genetic mem- 
ory elements and computation operations in living 
cells (6-10) to tracing cellular lineages during 
development and differentiation (11-13). Here, 
we first review the applications, prospects, and 
potential uses of these technologies in various 
biological and biomedical settings. We then outline 
current in vivo DNA-writing technologies, sum- 
marize the memory architectures and features 
that each of these technologies offers, and discuss 
their current limitations. 


DNA writers 


DNA writers are genetically encoded devices that 
enable targeted, dynamic, and recurring mod- 
ifications of DNA in living cells (2-7, 14). These 
modifications can take the form of targeted in- 
sertions, deletions, inversions, or base substitu- 
tion mutations and can serve as distinct DNA 
memory states (Fig. 1A). On the basis of the 
mutational outcomes, these devices can be broadly 
categorized into two classes: precise and pseudo- 
random writers. Precise DNA writers generate 
predetermined mutational outcomes, resulting 
in well-defined transitions between memory states 
in cell populations. Pseudorandom DNA writers 
generate targeted but stochastic mutational out- 
comes, resulting in unpredictable mutation signa- 
tures in cell populations. These two classes of DNA 
writers offer different levels of encoding capacity 
and control over memory states and operations, 
making them suitable for different sets of appli- 
cations (summarized in Table 1). 


DNA-writing applications 

Molecular recording 

Many molecular events that occur in biological 
systems are transient and thus difficult to moni- 
tor and study within their native context. DNA 
writing can be used to create molecular recorders 
that capture these transient signals and stably 
encode them into the DNA of cell populations or 
individual cells in vivo and in situ (Fig. 1B, left). 
The accumulated mutations can then be retrieved 
by DNA sequencing or functional assays to infer 
information about the original signals, even after 
the original signals are gone. This principle con- 
verts living cells into recording devices that mem- 
orize the history of their own signaling dynamics 
into permanent DNA records, which in turn can 
provide longitudinal insights into biological pro- 
cesses in their native contexts, as opposed to snap- 
shots in time obtained by current approaches. 
Several strategies, including conditional trans- 
criptional or posttranscriptional activation of DNA 
writer components, can be used to couple the 
activity of a given DNA writer to signals of interest 
(Fig. 1B, bottom left). For example, by using signal- 
responsive promoters, information regarding the 
presence, duration, intensity, order, and timing of 
biological cues (such as metabolites and cytokines) 
or environmental cues (such as light, pollutants, 
exposure to phages, or changes in temperature) 
can be recorded in DNA (2-6). Naturally occurring 
signal-responsive promoters could be linked to 
DNA writer activity and used as a proxy to record 
and study the dynamics of the corresponding sig- 
naling pathways. If desired, rational design or 
directed evolution could be used to decouple natural 
promoters from unwanted overlapping pathways 
(e.g., by removing binding sites of corresponding 
transcription factors) or to engineer synthetic 
promoters with altered response dynamics (15-17). 
Alternatively, conditional activation of DNA writ- 
ers in response to a desired signal can be achieved 
posttranscriptionally: for example, by implement- 
ing signal-dependent changes in conformation or 
interactions between DNA writer components. 


Basic research 


By offering an unprecedented ability to capture 
transient spatiotemporal molecular events in their 
native contexts, molecular-recording technologies 
could have broad utility across various disciplines 
(Fig. 1B, middle). For example, developmental biol- 
ogists could use these DNA recorders to study the 
dynamics of differentiation cues and developmental 
pathways. Cancer biologists could use these record- 
ers to study tumor development and to gain deeper 
insight into the cellular and environmental cues in 
tumor microenvironments that are involved in 
cancer heterogeneity. Immunologists could use 
these recorders to study signaling in immune cell 
maturation, memory formation, and immune re- 
sponses. Microbiologists could use these recorders 
to study signaling dynamics and molecular inter- 
actions within bacterial communities and biofilms. 

Various biological signals, ranging from small 
molecules to immunological cues to light, have 
been successfully recorded in both prokaryotic 
and eukaryotic cells (2-6). However, those re- 
cordings have been applied mainly to in vitro 
settings and relied on population-averaged read- 
outs (see Box 1). Future work is needed to im- 
prove these technologies for single-cell recording or 


Fig. 1. DNA-writing technologies and their 
emerging applications. (A) A schematic repre- 
sentation of a DNA writer (left) and mutation 
signatures generated by precise (middle) and 
pseudorandom (right) DNA writers. S0 and S1 
indicate unmodified (memory state 0) and mutated 
(memory state 1) alleles, respectively. B1 to B5 
indicate random memory states 1 to 5 that are 
generated by pseudorandom DNA writers and could 
serve as distinct barcodes. (B) Schematic repre- 
sentation of a molecular recorder and strategies 
that can be used to couple its activity to signals 
of interest (left), along with examples of applications 
in basic research (middle) and biotechnology 
(right). (C) Examples of evolutionary cellular 
engineering applications enabled by DNA writers. 
(D) Various forms of logic and computation can be 
achieved by layering multiple precise recorders. 
(E) Examples of strategies for high-throughput 
mapping of interactions or activities of variant 
libraries by DNA writers. (F) Pseudorandom DNA 
writers can be used to develop dynamic in vivo 
genetic barcoding schemes that distinctively and 
progressively mark cellular lineages over time. 
to demonstrate the transformative use of molecular 
recorders in live animals, where the longitudinal 
study of in situ biology is currently limited. Memory 
architectures that impose minimal fitness effects 
will be important for realizing the use of molecular 
recorders in challenging in vivo conditions. 


Living biosensors 


Nonbiological sensors are not optimized to inter- 
act with biological systems. Living cells, on the 
other hand, are useful chassis for hosting 
sensors that can respond to various biological 
cues. DNA-writing technologies can be used to 
create living biosensors for longitudinal health 
and environmental monitoring. For example, bac- 
terial cells endowed with disease biomarker 
sensors coupled with DNA recorders could be 
consumed orally, transit through the gastro- 
intestinal tract to record disease biomarkers, 
and report this information later when they exit 
the body (Fig. 1B, top right). Engineered human 
cells harboring molecular recorders could be de- 
ployed into the body to report on early signs of 
disease, such as cancer or neurodegeneration. 
Finally, engineered cells and animals equipped 


with recording capacities could be used to con- 
tinuously monitor and record the levels and ac- 
tivities of biological and environmental cues (such 
as toxins, heavy metals, metabolites, and light) 
without requiring artificial power supplies and 
in conditions and places that are not readily 
accessible to nonbiological sensors. Similar to 
basic research purposes, biosensing applications 
will need memory architectures with minimized 
fitness effects and extended recording capacities 
to achieve continuous and robust recording. 


Brain mapping 

Mapping the activities and connectome of neural 
circuits in the brain is one of the greatest chal- 
lenges of our time (Fig. 1B, bottom right). As an 
alternative to current imaging-based techniques, 
which suffer from trade-offs between resolution 
and throughput, DNA-based ticker tape circuits 
that allow for the dynamic logging of signals 
have been proposed for recording spatiotemporal 
neural activities (18). Although existing molecular- 
recording technologies offer temporal resolutions 
that are orders of magnitude longer than neural 
pulses (Box 1), they could potentially be used to 


Box 1. DNA memory features. 


Population-distributed versus single-cell recording 


Because of the probabilistic nature of DNA writing at the single-molecule level, a statistically sig- 
nificant number of recording substrates (i.e., DNA molecules) are required to achieve robust 
recording. All the molecular recorders described so far have utilized the distributed genomic DNA of 
cell subpopulations to achieve robust recording. Developing efficient writers and/or using these 
together with high-copy-number recording substrates could pave the way toward single-cell recording. 


Write cycles 


We define write cycles as the number of iterations in which new information can be added to a memory 
register encoded on a single molecule of DNA by a single DNA writer or recorder complex before the 
memory register becomes nonresponsive to that complex. With the use of base editing (6), stgRNA (4, 14), 
and Cas1-Cas2 (3, 28) technologies, memory architectures with write cycles of >1 have been demonstrated. 


Recording capacity 


We define recording capacity as the number of distinct memory states that can be recorded (and 
practically retrieved) in the entire storage unit (the cell population for population-level recording or 
an individual cell for single-cell recording) by using a single DNA writer or recorder complex. 


Digital versus analog recording 


Depending on the writer efficiency and the potential memory states in the population (the number of 
memory states in each cell × the number of cells), two signal-recording regimes can be defined 
(Fig. 2D). Digital recording (a sharp, saturating increase in the mutation frequency in the population 
in response to an input) can be achieved when highly efficient DNA writers are used or when the 
number of potential memory states is limited. Analog recording (a gradual accumulation of 
mutations in the population in response to an input) can be implemented when moderately 
efficient writers are used or when there are many potential memory states. This extended dynamic 
range enables one to infer information regarding signal intensity and duration, which are analog 
properties, as opposed to the absence or presence of a signal, which is digital information. 
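
The two regimes can be illustrated with a deliberately simple model, sketched below under stated assumptions: every allele is an independent recording substrate that is written with a fixed probability `p_write` per unit of signal exposure, so the expected mutated fraction after `t` units is 1 - (1 - p_write)^t. The model, the parameter values, and the function names are hypothetical and are not taken from the cited work; the sketch captures only the effect of writer efficiency, not the effect of the number of potential memory states.

```python
def mutated_fraction(p_write, exposure_units):
    """Expected fraction of alleles written after a number of exposure units.

    Assumes independent alleles and a constant per-unit writing probability.
    """
    if not 0.0 <= p_write <= 1.0:
        raise ValueError("p_write must be between 0 and 1")
    return 1.0 - (1.0 - p_write) ** exposure_units


def response_curve(p_write, max_units):
    """Mutated fraction at exposures 0, 1, ..., max_units."""
    return [mutated_fraction(p_write, t) for t in range(max_units + 1)]


if __name__ == "__main__":
    # Illustrative efficiencies only: a highly efficient and a moderately efficient writer.
    for label, p in (("efficient writer", 0.9), ("moderate writer", 0.01)):
        print(label, [round(x, 3) for x in response_curve(p, 5)])
```

Under these assumptions the efficient writer saturates within a couple of exposure units, which resembles the digital regime, whereas the moderate writer accumulates mutations almost linearly over the same range, which is what allows signal intensity and duration to be inferred from the mutation frequency.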


Sequential and temporal resolution 


Analog recorders integrate a signal over time but do not necessarily preserve information about the 
relative order or timing of multiple signals or the recurrence of a signal. Memory architectures with 
the capacity to record sequential and temporal information have been developed by using site- 
specific recombinases (7), base editing (6), and Cas1-Cas2 (3), although their resolution, write 
cycles, and recording capacity still need to be improved for demanding applications. Notably, ticker 
tape memory architectures (18) that record signaling dynamics in a temporally resolved fashion 
could enable one to infer signal intensity as a function of time.

study time-averaged neural activities. For example,
neural activities can be linked to molecular
recorders via neural activity-responsive regulatory 
elements, such as immediate early gene promoters 
(19). Live animals harboring these genetic re- 
corders could then be subjected to different 
neural stimuli, and the resulting mutational 
signatures could be used to infer time-averaged 
activities across the entire animal brain. Alter- 
natively, DNA writers encoded on mobilizable 
genetic elements that can pass through synapses, 
such as rabies or pseudorabies viruses, could be 
used to distinctively mark neural connections by 
DNA barcodes that could then be used to map 
the connectome in a high-resolution and high- 
throughput fashion (20). Despite many technical 
challenges, we envision that applying molecular- 
recording technologies to decipher the functional 
architecture of the brain will be a strong driving 
force for the advancement of these technologies, 
especially in terms of scalability, recording ca- 
pacity, and temporal and spatial resolution. 


Evolutionary cellular engineering 
Continuous in vivo evolution 


In vivo DNA-writing technologies could be used 
to recurrently mutate desired genomic segments 
and achieve targeted genetic diversification within 
a short period. Once coupled with continuous 
selection, this strategy could enable continuous 
rounds of evolution to improve cellular traits of 
interest or to quickly evolve protein and RNA 
scaffolds for biotechnological and therapeutic 
applications (Fig. 1C, left). Unlike in molecular- 
recording applications, where it is desirable to 
minimize fitness effects to achieve robust record- 
ing, in evolutionary engineering applications, a 
selective pressure is applied to direct evolu- 
tionary trajectories toward desired outcomes. 
DNA-writing technologies with relaxed (21) or
obviated (2) requirements for cis-encoded ele- 
ments and extended mutational spectra (22) could 
be especially useful for these applications. 


Synthetic Lamarckian evolution 


Living cells have evolved mechanisms to elevate 
their local mutation rate under certain condi- 
tions and in response to specific signals. For ex- 
ample, during antibody maturation, CRISPR-Cas9 
spacer acquisition, and mutagenesis processes 
mediated by diversity-generating retroelements 
in phages and bacteria, a series of actively reg- 
ulated molecular events lead to targeted muta- 
genesis in certain genomic loci. These Lamarckian 
evolutionary strategies can increase the overall fit- 
ness of cell populations in uncertain environments 
and help them to adapt to environmental changes 
at greater rates than are possible by random 
Darwinian mutations. DNA-writing technologies 
could be used to emulate Lamarckian evolution 
by increasing the local mutation rates of desired 
genetic loci in response to signals of interest 
and in the presence of suitable selective pressures 
(Fig. 1C, right). Cells engineered with such a ca- 
pacity could evolve faster than possible by natural 
evolution and enable adaptive cell-based therapeutics
that tune their responses to the conditions
they encounter. Alternatively, engineered bacteriophages
endowed with the capacity to target and
mutagenize their own host-range determinants 
could be useful for the streamlined development 
of phage-based antimicrobials that could adapt to 
infect new hosts faster than natural phages. 


Applications specific to precise 
DNA writers 

Layered molecular-recording, 
computation, and artificial-learning 
gene circuits 


The precise and well-defined nature of 
the mutational outcomes generated by 
precise DNA writers allows them to be 
layered into more sophisticated genetic 
circuits in which the mutational outcome 
of one element can be used as inputs for 
other elements. By doing so, information 
regarding a series of input signals can be 
recorded in the form of well-defined tran- 
sitions between multiple memory states. 
This strategy has been used to encode var- 
ious forms of combinatorial, sequential, 
and temporal logic and other increasingly 
complex computing operations in living 
cells (Fig. 1D) (6-10). Additionally, because 
of the predictable nature of precise writers, 
their mutational output can be linked to 
functional genetic elements and used to 
control gene expression. These ration- 
ally designed genetic programs could be 
used, for example, to study or control the 
sequence and timing of developmental 
programs or to build gene circuits that 
classify disease conditions on the basis of 
multiple inputs. In addition, genetic pro- 
grams could be created to endow cells 
with artificial learning capabilities such 
that specific circuit responses are grad- 
ually reinforced (or degraded) in response 
to signals (6), much like the reinforce- 
ment of synaptic interconnections in 
neurons. 


High-throughput interaction and 
activity mapping 

Transient cellular events, such as protein- 
protein interactions, can be converted 
into transcriptional outputs and therefore 
captured into DNA memory. For example, 
a split DNA-writing system, where the 
N- and C-terminal domains of a precise 
DNA writer are fused to barcoded bait 
and prey, respectively, could be used to 
record protein-protein interactions (Fig. 
1E, left). In cells harboring interacting 
partners, a functional DNA writer could 
be reconstituted and write a prey-specific 
barcode next to a bait-specific barcode. 
The joined barcode could then be re- 
trieved by sequencing to identify inter- 
acting partners in pooled libraries in a 
high-throughput fashion. Analogous strat- 
egies could be used to study the activities 
of RNA and protein variant libraries in a 
high-throughput fashion (Fig. 1E, right).

Fig. 2. Precise DNA-writing technologies. (A) Site-specific
recombinases. (B) Recombineering. (C) Base editing.
CDA, cytidine deaminase; d/nCas9, dCas9 or nickase Cas9.
(D) Digital versus analog recording (see Box 1).

Application specific to pseudorandom
DNA writers: lineage tracing

Capturing cellular ancestry relationships during
development and creating corresponding lineage
maps, especially in larger animals, have been
a long-standing challenge in developmental biology.
Traditionally, various static genetic and nongenetic
barcoding approaches have been used for
lineage tracing (23-25). In these methods, once a 
cell receives a barcode, it passes the barcode to 
its progenies with no change. Therefore, lineages 
that are generated in later stages are not differentially
barcoded and, as a result, only
a low-resolution lineage tree can be con- 
structed. DNA writers can be used to de- 
vise dynamic genetic barcoding schemes 
that continuously and distinctively mark 
cell lineages as they progress in vivo, thus 
enabling higher-resolution lineage maps 
(Fig. 1F). Lineage tracing can be con- 
sidered a specific application of molecular 
recording, where, instead of a transient 
signal, the chronicle of transient events 
(e.g., cell divisions) is recorded in DNA and 
later retrieved by sequencing. Pseudo- 
random writers are especially useful for 
lineage-tracing applications because they 
can generate many distinct mutational 
signatures in an initially clonal population. 
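Why dynamic barcodes resolve more of the lineage tree than static ones can be sketched as follows. The model is an idealization and not the authors' method: every division appends one new mark to each daughter, marks are never overwritten or deleted, and the two sisters always receive different marks. All names (`grow_lineage`, `shared_depth`) and parameters are hypothetical.

```python
import random


def sister_marks(rng, alphabet="ACGT", length=3):
    """Draw two distinct pseudorandom marks, one per daughter cell (idealized)."""
    first = "".join(rng.choice(alphabet) for _ in range(length))
    second = first
    while second == first:
        second = "".join(rng.choice(alphabet) for _ in range(length))
    return first, second


def grow_lineage(generations, seed=0):
    """Return leaf barcodes; a barcode is the ordered tuple of marks along a lineage."""
    rng = random.Random(seed)
    cells = [()]  # a single founder cell with an empty barcode
    for _ in range(generations):
        daughters = []
        for barcode in cells:
            for mark in sister_marks(rng):
                daughters.append(barcode + (mark,))  # append-only: history is kept
        cells = daughters
    return cells


def shared_depth(a, b):
    """Number of leading marks two cells share = generation of their last common ancestor."""
    depth = 0
    for x, y in zip(a, b):
        if x != y:
            break
        depth += 1
    return depth


leaves = grow_lineage(3)
print(shared_depth(leaves[0], leaves[1]))  # sisters
print(shared_depth(leaves[0], leaves[2]))  # first cousins
print(shared_depth(leaves[0], leaves[7]))  # related only through the founder
```

With a static barcode every leaf would carry the same tag and all pairwise comparisons would be uninformative; with accumulating marks, the length of the shared prefix orders the branch points. Real NHEJ-based recorders violate the append-only assumption, which is one of the limitations discussed in the text.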


Precise DNA writers 


Three classes of precise DNA writers have 
been described to date (Table 1), each fea- 
turing a different DNA-writing efficiency 
and thus a different recording regime (Box 1). 
Site-specific recombinases are the most 
efficient and well-established class of pre- 
cise DNA writers. Depending on the orien- 
tation of their DNA recognition sites, these 
enzymes can either flip or excise a piece of 
DNA that lies between their cognate sites, 
thus memorizing the history of exposure 
to a signal in the form of defined and per- 
manent DNA reconfiguration (Fig. 2A, 
transition from S0 to S1). Because of their
relatively high efficiency, these DNA writers 
have been used mainly in digital recording 
(Box 1 and Fig. 2D) and building layered 
synthetic gene circuits for digital compu- 
tation (7-9). 
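The flip-or-excise behavior can be sketched with a toy data model. The representation below (a list of `(name, strand)` tuples) and the function name `recombine` are illustrative assumptions; the rule that same-orientation sites lead to excision and opposite-orientation sites lead to inversion follows the well-known behavior of recombinases such as Cre acting on loxP sites, and the sketch ignores any change to the sites themselves.

```python
def recombine(dna, site):
    """Apply one recombination event between the first two copies of `site`.

    `dna` is a list of (name, strand) tuples, with strand equal to +1 or -1.
    Simplification: the recognition sites are returned unchanged.
    """
    positions = [i for i, (name, _) in enumerate(dna) if name == site]
    if len(positions) < 2:
        return list(dna)  # fewer than two sites: nothing to recombine
    i, j = positions[0], positions[1]
    if dna[i][1] == dna[j][1]:
        # Sites in the same orientation: excise the segment, leaving one site.
        return dna[:i + 1] + dna[j + 1:]
    # Sites in opposite orientation: flip the segment (reverse order and strand).
    flipped = [(name, -strand) for name, strand in reversed(dna[i + 1:j])]
    return dna[:i + 1] + flipped + dna[j:]


# State S0: a promoter pointing away from the reporter gene.
s0 = [("site", +1), ("promoter", -1), ("site", -1), ("gfp", +1)]
s1 = recombine(s0, "site")
print(s1)  # the promoter now reads toward gfp

# Excision: a terminator flanked by sites in the same orientation is removed.
blocked = [("site", +1), ("terminator", +1), ("site", +1), ("gfp", +1)]
print(recombine(blocked, "site"))
```

Either outcome is a discrete, heritable change in DNA configuration, which is why this class of writers maps naturally onto digital memory states.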

The second class of precise DNA writers 
relies on reverse transcriptase (RT)- 
mediated in vivo single-stranded DNA 
(ssDNA) expression followed by recombine- 
ering to achieve cis element-independent 
DNA writing in bacteria (Fig. 2B) (2). The 
moderate writing efficiency of this sys- 
tem offers wider-dynamic-range molecular 
recording in which the analog properties 
of biological signals, such as signal inten- 
sity and exposure duration, are recorded 
into the overall genomic DNA of cell pop- 
ulations (Box 1 and Fig. 2D). Because these 
DNA writers do not require cis-encoded 
elements on the target, they are desirable 
for evolutionary engineering applications. 

Table 1. Features and demonstrated applications for the current DNA-writing technologies. TBD, to be determined; RSM, recombinase-based state
machine; BLADE, Boolean logic and arithmetic through DNA excision; SCRIBE, synthetic cellular recorders integrating biological events; CAMERA, CRISPR-mediated
analog multi-event recording apparatus; DOMINO, DNA-based ordered memory and iteration network operator; GESTALT, genome editing of
synthetic target arrays for lineage tracing; MEMOIR, memory by engineered mutagenesis with optical in situ readout; mSCRIBE, mammalian synthetic
cellular recorders integrating biological events; TRACE, temporal recording in arrays by CRISPR expansion.

The third class of precise DNA writers
performs nucleotide-resolution manipulation
of DNA via base editing (26). In this
system, a base editor, such as a cytidine
deaminase domain fused to dead Cas9
(dCas9), is addressed to a desired target
site by expression of a complementary
guide RNA (gRNA), generating deoxycytidine
(dC)-to-deoxythymidine (dT) mutations within a 
narrow window in the target vicinity (Fig. 2C). 
As these memory operators are CRISPR-Cas9 
based, they are more scalable than other precise 
writers. Additionally, they can be functionalized 
with regulatory modules (such as CRISPR inter- 
ference and activation) to achieve complex record- 
ing and computation operations in living cells (6). 
Recently, an adenosine deaminase base editor, which 
writes deoxyadenosine (dA)-to-deoxyguanosine 
(dG) mutations, was developed (22), further ex- 
panding the mutation spectrum and utility of this 
class of DNA writers and paving the way toward 
bidirectional DNA-writing systems that could be 
used for advanced computation and evolutionary 
engineering applications. 
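A minimal sketch of the writing operation described above, treating the target as a string. The window boundaries (positions 4 to 8 of the protospacer) and the deterministic conversion of every eligible base are assumptions made for illustration; real editors act probabilistically, and the window depends on the specific editor. The function name `base_edit` is hypothetical.

```python
def base_edit(sequence, protospacer_start, window=(4, 8), src="C", dst="T"):
    """Return `sequence` with every `src` base inside the editing window set to `dst`.

    `protospacer_start` is the 0-based index where the gRNA-matching region begins;
    `window` gives 1-based, inclusive positions within that region (assumed values).
    """
    lo = protospacer_start + window[0] - 1
    hi = protospacer_start + window[1]  # exclusive end of the slice
    edited = sequence[lo:hi].replace(src, dst)
    return sequence[:lo] + edited + sequence[hi:]


target = "GACCCATCGTACCGGTTACG"
print(base_edit(target, 0))                           # cytidine deaminase: dC-to-dT
print(base_edit("AAAAAAAAAA", 0, src="A", dst="G"))   # adenosine deaminase: dA-to-dG
```

Note that the C at position 3 of `target` lies outside the window and is left untouched, which mirrors the narrow editing window mentioned in the text.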




Pseudorandom DNA writers 

Two main classes of pseudorandom DNA writers 
have been described to date. The first class relies 
on targeted double-stranded DNA (dsDNA) breaks 
generated by site-specific nucleases, such as CRISPR- 
Cas9, followed by error-prone repair of the breaks 
by the nonhomologous end-joining (NHEJ) path- 
way (11). During this process, each individual cell 
can acquire a pseudorandom mutational signature 
(i.e., indel mutations) in the target locus. Several 
studies have used these mutational signatures 
as barcodes to trace cellular lineages during em- 
bryo development in zebrafish and other small 
animals or in situ in cell cultures (Fig. 3A) (11-13).
Efforts to extend the write cycles (see Box 1) of 
these molecular recorders led to the development 
of evolving barcodes (4, 14). This memory archi- 
tecture was built by engineering a protospacer 
adjacent motif (PAM) into the gRNA-encoding 
locus, resulting in self-targeting gRNAs (stgRNAs) 
that undergo iterative barcoding cycles during 
which the stgRNA locus is repeatedly diversified 
(Fig. 3B). This memory architecture was lever- 
aged to build a population-level analog recorder 
(4) and to dynamically barcode cellular lineages 
in mammalian cells (14).

Despite these improvements, the reliance of 
these memory architectures on dsDNA breaks 
and NHEJ still limits their write cycles and makes 
them unsuitable for usage in organisms that lack 
an efficient NHEJ pathway, such as most prokar- 
yotes. The prevalence of deletions in NHEJ can 
result in shortening of the stgRNA and loss of
the PAM, thereby rendering the recorder nonfunctional
over time. Moreover, the stochastic and
deletion-based nature of the mutations generated 
by these strategies can result in nonpersistent 
encoding, where new memory states overwrite 
previous ones, thus making it difficult to infer 
ancestral relationships. Furthermore, encoding 
multiple stgRNAs in the same cell could result 
in unwanted chromosomal rearrangements and 
cellular toxicity. To extend the use of DNA writers 
for high-resolution lineage-tracing applications, 
particularly in larger animals, new memory archi- 
tectures with improved efficiency, extended write 
cycles, reduced toxicity, and persistent (i.e., non- 
deletion-based) barcoding are desired. Alternative 
strategies, for example, using C- or A-rich stgRNAs 
in combination with base editors, could be de- 
vised to combine the high stor- 
age capacity of pseudorandom 
memory architectures with the 
well-defined and persistent mem- 
ory states offered by precise DNA 
writers to address some of the 
above-mentioned limitations. 
The second class of pseudo- 
random DNA writers was built 
upon Cas1 and Cas2 proteins, 
which naturally mediate spacer 
acquisition in the CRISPR bac- 
terial immune system. In this 
system, the Cas1-Cas2 complex
samples the intracellular ssDNA 
pool, which can originate from 
various intracellular or extra- 
cellular sources, and integrates short (~20 to 
30 base pairs) ssDNA fragments from this pool 
into a preexisting CRISPR array, resulting in 
extension of the array over time (Fig. 3C). New 
spacers are added to the leader-proximal site, so 
the chronological order of spacer addition events 
is preserved within the array configuration. By 
placing the expression of the Cas1-Cas2 cassette 
under the control of signal-inducible promoters 
and introducing exogenous oligonucleotides into 
Escherichia coli cell populations, Shipman et al.
(27) demonstrated that the signal intensity and 
duration can be inferred from array extensions 
and that the temporal order of the addition of 
the oligonucleotide pools can be inferred from 
the array composition. In a follow-up study, the 
authors demonstrated that artificial digital in- 
formation, such as small pictures and movies, 
could be encoded into oligonucleotide pools and 
recorded into the distributed genomic DNA of a 
cell population (28). Building on these results, 
Sheth et al. (3) showed that instead of providing
exogenous oligonucleotides, the intracellular 
ssDNA pool could be dynamically modulated 
by using template plasmids with tunable copy 
numbers. Using this strategy and multiple tem- 
plate plasmids, the authors demonstrated that 
the temporal order of multiple signals and line- 
age information of bacterial populations could be 
recorded in the CRISPR array composition (Fig. 
3D). Though the Cas1-Cas2 writing system offers 
a relatively persistent memory architecture with 
desirable features, such as sequentially and
temporally resolved recording and extended write
cycles, it is currently limited to bacteria. The sys- 
tem could offer an attractive strategy for lineage 
tracing if it could be adapted to eukaryotes and 
function stably over multiple generations. 
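The way a CRISPR array preserves temporal order can be sketched with a list in which index 0 is the leader-proximal position. This is a toy model of the bookkeeping only; the function names and the single-letter spacer labels are hypothetical, and the sketch ignores acquisition efficiency and population heterogeneity.

```python
def acquire(array, spacer):
    """Insert a newly acquired spacer at the leader-proximal end (index 0)."""
    return [spacer] + array


def chronological_order(array):
    """Read the array from the leader-distal end to recover oldest-first order."""
    return list(reversed(array))


array = []
for signal in ["A", "B", "C"]:  # spacers derived from three successive signals
    array = acquire(array, signal)

print(array)                       # leader-proximal first: newest spacer leads
print(chronological_order(array))  # inferred order of exposure
```

Because each write extends the array rather than replacing earlier content, the order of events can be read directly from spacer positions after sequencing.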


Conclusion and future prospects 


In the past few years, we have witnessed the 
transition from the read-only genomic era to the 
read-and-write era. DNA-writing technologies have 
transformed genomic DNA into a dynamic me- 
dium for processing and storing biological and 
artificial information in living cells. These advances 
herald a new generation of powerful approaches 
for investigating and engineering in situ biology 
in basic research, biotechnology, and medicine. Al- 
though substantial progress has been made, there 
is plenty of room for improving existing memory 
architectures or developing new ones with desir- 
able features, especially in terms of recording 
capacity, scalability, robustness, fitness effects, 
cellular resource consumption, write cycles, tempo- 
ral resolution, and recording kinetics. These tech- 
nologies promise to further advance our ability to 
manipulate life’s natural memory storage media in 
a dynamic, longitudinal, and multiplexed fashion.


TECHNOLOGIES TRANSFORMING BIOLOGY


REVIEW 


Single-particle cryo-EM—How did it 
get here and where will it go 


Yifan Cheng 


Cryo-electron microscopy, or simply cryo-EM, refers mainly to three very different yet 
closely related techniques: electron crystallography, single-particle cryo-EM, and 
electron cryotomography. In the past few years, single-particle cryo-EM in particular has 
triggered a revolution in structural biology and has become a newly dominant discipline. 
This Review examines the fascinating story of its start and evolution over the past 
40-plus years, delves into how and why the recent technological advances have been so 
groundbreaking, and briefly considers where the technique may be headed in the future. 


Physicist and Nobel laureate Richard Feynman
once famously stated, “It is very easy
to answer many fundamental biological
questions; you just look at the thing!” (1).
Indeed, the central idea behind structural
biology is that once we are able to “look” at “things” 
in great enough detail to discern their atomic 
structures, we will naturally be able to answer 
how and why the components and players of 
complex biological processes work the way they 
do. True to this aim, structural biology has contributed
substantially to major biological discoveries
throughout history (2). It has also facilitated, and will
continue to facilitate, the development of therapeutic
agents to cure diseases or ameliorate pathological
symptoms.

The major techniques available to structural 
biologists are x-ray crystallography, nuclear mag- 
netic resonance (NMR) spectroscopy, and electron 
microscopy (EM). Among them, x-ray crystallog- 
raphy contributes most of the atomic coordinates 
of biological macromolecules deposited in the 
Protein Data Bank (PDB) (3). In this method, 
structures are determined from diffraction patterns 
generated from well-ordered three-dimensional 
(3D) crystals of biological macromolecules. The 


Fig. 1. Establishing single-particle cryo-EM. (A) An electron diffraction 
pattern of a frozen hydrated catalase crystal. [Reprinted from (9) with
permission from Elsevier] The diffraction spots are visible beyond 3-Å
resolution. This experiment established the concept of cryo-EM. (B) An 
electron micrograph of frozen hydrated adenovirus particle recorded from a 
frozen hydrated grid prepared with plunge freezing. The micrograph is from the 


Cheng, Science 361, 876-880 (2018) 31 August 2018 


resolutions of structures determined depend largely 
on the quality of the crystals; in short, obtaining 
well-ordered 3D crystals of sufficient size is usually 
a prerequisite for atomic structure determination 
of any biological macromolecules by using this 
technique. It works well for many proteins or 
stable complexes; however, for certain categories 
of biological macromolecules, growing large and 
well-ordered 3D crystals is a very difficult or im- 
possible task. For example, crystallizing integral 
membrane proteins or large and dynamic com- 
plexes and machineries can be challenging. An 
extension of x-ray crystallography is the x-ray
free-electron laser (XFEL), whose ultimate goal is
to determine atomic structures without crystals
but which currently still requires a large number of small
crystals (4).

Can atomic structures of biological macromole- 
cules be determined without crystallization? Or 
is it possible, as Feynman once suggested, to de- 
termine their structures by “looking” at them 
by using a powerful electron microscope? Early 
pioneers pursued this question in the 1970s and 
developed a new EM-based method known today 
as single-particle cryo-electron microscopy (cryo-EM)
(5, 6). At the beginning, the method yielded
rather low-resolution results. Drawn to the promise
of being able to study biological macromolecules
without crystallizing them, however,
the cryo-EM community dedicated itself to per- 
fecting the technique over more than four de- 
cades, yielding steady improvements in both the 
technique’s applicability and the resolution of 
its results. Gradually, it has become a major tool 
in structural biology, complementary to x-ray 
crystallography and widely used to study large 
macromolecular complexes that are difficult to
crystallize. A few years ago, some amazing
technological breakthroughs further enabled rou- 
tine atomic resolution structure determinations 
with this method. Today, single-particle cryo-EM 
is no longer a complementary technique but a 
dominant one, changing the field of structural 
biology in a profound and unprecedented way 
and facilitating major new discoveries. 


A brief history of single-particle cryo-EM 


It is actually quite difficult to “look” at biological 
macromolecules in an electron microscope. De- 
termining their atomic structures from electron 
micrographs is even more complicated. First, EM 
images are 2D projections of biological macro- 
molecules, but not their 3D structures. This was 
resolved by De Rosier and Klug, who demon- 
strated that a 3D structure can be reconstructed 
by combining 2D projection images of the same 
object along different directions (7). 
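The mathematical basis usually given for this kind of reconstruction, which the text does not spell out, is the projection-slice theorem: the Fourier transform of a projection equals a central slice through the Fourier transform of the object. The sketch below checks the two-dimensional analogue numerically on a small array using only the standard library; the helper names `dft` and `dft2` and the test image are illustrative.

```python
import cmath


def dft(values):
    """Naive 1D discrete Fourier transform."""
    n = len(values)
    return [
        sum(v * cmath.exp(-2j * cmath.pi * k * x / n) for x, v in enumerate(values))
        for k in range(n)
    ]


def dft2(image):
    """Naive 2D DFT of a square image stored as image[y][x]; returns F[ky][kx]."""
    rows = [dft(row) for row in image]              # transform along x
    cols = [dft(list(col)) for col in zip(*rows)]   # transform along y
    return [list(r) for r in zip(*cols)]            # transpose back to [ky][kx]


image = [
    [1, 2, 0, 3],
    [0, 1, 4, 1],
    [2, 0, 1, 0],
    [1, 3, 2, 2],
]

# Project the 2D object onto the x axis by summing along y.
projection = [sum(column) for column in zip(*image)]

# The 1D transform of the projection should match the ky = 0 slice of the 2D transform.
slice_from_projection = dft(projection)
central_slice = dft2(image)[0]
max_diff = max(abs(a - b) for a, b in zip(slice_from_projection, central_slice))
print(max_diff)
```

Each projection direction fills in a different central slice, so projections recorded along enough directions populate Fourier space and the object can be recovered by inverse transformation.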

Second, because of strong scattering, the elec- 
tron beam has to be confined in a high vacuum, 
and all EM samples need to be placed inside said 
vacuum. This is not a problem for inorganic ma- 
terials. But if one simply places a biological sam- 
ple inside an electron microscope, vacuum-caused 
dehydration would destroy the sample’s struc- 
tural integrity. The seemingly impossible task 
of keeping protein samples hydrated in a high 


Howard Hughes Medical Institute, Department of 
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Francisco, San Francisco, CA, USA. 


same dataset described in (11). [Reprinted from (9) with permission from
Elsevier] (C) The first 3D reconstruction of bacteriorhodopsin determined with
electron crystallography. [Reprinted from (15) with permission from Springer
Nature] (D) A 3D model of the 50S ribosome subunit determined with
single-particle reconstruction of a negatively stained large ribosomal subunit
from Escherichia coli. [Reprinted from (22) with permission from Wiley]

vacuum was accomplished by Taylor and Glaeser.
They recorded better than 3-Å resolution electron
diffraction patterns from frozen hydrated catalase 
crystals, demonstrating that the structural integ- 
rity of biological macromolecules in a high vac- 
uum can be maintained through frozen hydration 
(Fig. 1A) (8, 9). The practical implementation of this
approach, however, was not easy until a plunge
freezing technique was developed by Dubochet 
and colleagues in the 1980s (10, 11). They applied 
purified protein samples in solution to an EM grid
covered with a thin layer of holey carbon film
and blotted the grid with filter paper, which
removed most of the solution; surface tension then
drove the remaining solution into a thin liquid
film across holes in the carbon film. Plunging the
grid rapidly into liquid ethane cooled by liquid 
nitrogen froze it into a thin layer of amorphous 
ice with the protein sample embedded within it 
in random orientations. After that, the frozen grid 
was transferred into an electron microscope and 
kept at near-liquid nitrogen temperature for imag- 
ing (Fig. 1B). This method is still used routinely 
without major changes, except that we now use 
a machine to blot and plunge grids with tunable 
parameters. 

Third, radiation damage by the high-energy 
electron beam limits the total electron dose that 
can be used to image biological samples. The con- 
sequence is that images recorded with such a low 
electron dose have very poor signal-to-noise ratios 
(SNRs). Cooling the sample to liquid nitrogen or 
even liquid helium temperature can reduce the 
radiation damage and allow the sample to tolerate
a higher electron dose (12, 13), but this is still far from
sufficient to directly visualize high-resolution
details in raw micrographs. Henderson and
Unwin overcame this problem by use of a crystallographic
approach, averaging images of many
identical proteins packed as 2D crystals (14).

Fig. 2. Direct-electron-detection camera-enabled atomic structure
determination. (A) Comparison of DQE curves of a scintillator-based CCD
camera (black), and direct electron detection camera K2 operating in
base mode (blue) and super-resolution counting mode (red) (31). (B) A
typical electron micrograph of archaeal 20S proteasome (~700 kDa in

Images of glucose-embedded 2D crystals recorded with
very low electron doses show no visible features,
but their Fourier transforms show clear reflections.
Similarly, electron diffraction from such
2D crystals at very low electron doses produced
good-quality diffraction patterns. Combining the 
phases calculated from the Fourier transforma- 
tions of images and the amplitudes obtained from 
diffractions produced a high-resolution projection 
map of the specimen (14). Combining data collected
from specimens tilted at different angles
produced a 3D reconstruction similar to the density 
map of an x-ray crystal structure. This approach 
produced the first structure of an integral membrane
protein, initially at ~7-Å resolution (Fig.
1C) (15) and finally at atomic resolution (16). The
method became known as electron crystallogra- 
phy, which relies on well-ordered 2D crystals. This 
method has produced atomic structures of several 
integral membrane proteins (17, 18) and one soluble
protein (19). The highest resolution achieved
was 1.9 Å, resolving a lipid bilayer surrounding
an aquaporin-0 (AQP0) water channel (20). But
the difficulty of growing well-ordered 2D crystals 
hinders the broad application of the method. 

In parallel, Frank proposed an idea to deter- 
mine protein structures without crystallization: 
computationally combining images of many indi- 
vidual protein particles of the same type (5). This 
conceptually novel idea was first tested by using 
protein samples that were negatively stained for 
EM observation (Fig. 1D) (21, 22). The later com- 
bination of this approach with the plunge freez- 
ing sample preparation became what we now call 
“single-particle cryo-EM.” It does not require grow- 
ing proteins into crystals of any form. Instead, it 
determines structure by computationally align- 
ing and combining cryo-EM images of many 
biological molecules randomly oriented within a 


thin layer of vitreous ice. A large number of images
is needed both to enhance the SNR and to provide all
the different views needed for 3D reconstruction.
More detailed technical descriptions of this 
method can be found in many recent reviews, 
such as (23, 24). 
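The SNR benefit of combining many particle images can be illustrated with a toy one-dimensional simulation. It assumes perfectly aligned images of identical particles with independent Gaussian noise, which is an idealization of the real problem; under that assumption the noise in the average falls roughly as the square root of the number of images. All names and numerical values are illustrative.

```python
import random


def noisy_copy(signal, sigma, rng):
    """One simulated low-dose image: the true signal plus independent Gaussian noise."""
    return [s + rng.gauss(0.0, sigma) for s in signal]


def average(images):
    """Pixel-wise mean of images that are assumed to be already aligned."""
    n = len(images)
    return [sum(pixels) / n for pixels in zip(*images)]


def rms_error(estimate, truth):
    """Root-mean-square deviation of an estimate from the noise-free signal."""
    return (sum((e - t) ** 2 for e, t in zip(estimate, truth)) / len(truth)) ** 0.5


rng = random.Random(1)
truth = [1.0 if 80 <= i < 120 else 0.0 for i in range(200)]  # toy 1D "particle"
sigma = 5.0                                                  # noise far above the signal

single = noisy_copy(truth, sigma, rng)
stack = [noisy_copy(truth, sigma, rng) for _ in range(400)]

err_single = rms_error(single, truth)
err_average = rms_error(average(stack), truth)
print(f"single image error: {err_single:.2f}")
print(f"400-image average error: {err_average:.2f}")
```

In practice the particles are randomly oriented and must first be classified and aligned computationally, so the achievable gain also depends on alignment accuracy, as the following paragraphs of the text discuss.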

Furthermore, what electron micrographs record 
are projections of the specimen convolved with
a contrast transfer function (CTF). The CTF is a sinusoidal
function oscillating in a frequency-dependent
manner that modulates both phase and amplitude
of an image in frequency space (25). In addition
to a number of microscope-dependent parameters,
the CTF is determined by how far from
focus an image is recorded, the so-called “defocus.”
To retain the highest resolution, images must be 
recorded very close to focus. Such images, how- 
ever, have very limited contrast. This is not a prob- 
lem for radiation-insensitive inorganic materials, 
which can be imaged with very high electron 
doses so as to generate sufficient contrast while 
retaining a high-resolution signal. However, images 
of radiation-sensitive frozen hydrated biological 
samples have to be recorded with a large defocus 
in order to generate sufficient contrast, which in 
turn dampens the high-resolution signal. 
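The trade-off between contrast and defocus can be made concrete with a simplified CTF. The formula below is the common weak-phase approximation, CTF(k) = -sin χ(k) with χ(k) = πλΔz k² - (π/2) C_s λ³ k⁴, and is an assumption brought in for illustration; it is not taken from the text. Sign conventions vary between software packages, and amplitude contrast and envelope (damping) terms are omitted. The parameter values (300 kV, C_s = 2.7 mm) are typical but hypothetical.

```python
import math


def electron_wavelength(voltage):
    """Relativistic electron wavelength in angstroms for a voltage in volts."""
    return 12.2643 / math.sqrt(voltage + 0.97845e-6 * voltage ** 2)


def ctf(k, defocus, voltage=300e3, cs=2.7e7):
    """Simplified CTF at spatial frequency k (1/angstrom).

    `defocus` (underfocus positive) and `cs` (spherical aberration) are in angstroms.
    Amplitude contrast and envelope functions are deliberately left out.
    """
    lam = electron_wavelength(voltage)
    chi = math.pi * lam * defocus * k ** 2 - 0.5 * math.pi * cs * lam ** 3 * k ** 4
    return -math.sin(chi)


def first_zero_resolution(defocus, step=1e-4):
    """Resolution (angstroms) at which the CTF first crosses zero, found by scanning k."""
    k = step
    while ctf(k, defocus) < 0:
        k += step
    return 1.0 / k


low_defocus, high_defocus = 1.0e4, 3.0e4  # 1 and 3 micrometers, in angstroms
k_low = 0.01                              # a low spatial frequency (100-angstrom features)

print(abs(ctf(k_low, low_defocus)), abs(ctf(k_low, high_defocus)))
print(first_zero_resolution(low_defocus), first_zero_resolution(high_defocus))
```

In this sketch, tripling the defocus roughly triples the transfer of low-frequency information, which is what makes particles visible, but it also moves the first CTF zero to coarser resolution and makes the function oscillate faster at high frequencies, where the omitted envelope terms would further attenuate the signal.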

The resolution of a single-particle cryo-EM 
structure depends on many factors, including the 
resolution and contrast of individual particle 
images, accuracy of aligning these images with 
each other, obtaining a sufficient number of images 
from all necessary views of the macromolecule 
within a reasonable time frame, the conforma- 
tional and compositional homogeneity of these 
particles, and access to powerful enough computers 
with which to process images efficiently. Less than 
optimal conditions for any of these factors—for 
example, instability of the electron microscope, 
relatively poor performance of the image-recording 
device and beam-induced sample motions, accu- 
racy of classifying and aligning particle images 
limited by image quality and computational al- 
gorithms, and limited computer power to process 
very large numbers of particle images—can limit 


molecular weight) recorded by use of a direct-electron-detection camera 
(60). (C) Fourier power spectrum calculated from the image in (B). 

[(B) and (C) reprinted from (60) with permission from Elsevier] High-resolution
signal at ~3-Å resolution is clearly visible (60). (D) A portion of
the density map from the 3D reconstruction of archaeal 20S proteasome (31).

the resolution. Because of these many technical
challenges, the resolution achieved was, for a 
long time, limited to levels far from sufficient for 
deriving de novo atomic models. 

At a time when the resolution of some of the best
reconstructions was in the 30- to ~50-Å range [for
example, the mammalian 40S ribosome (26) and 
ryanodine receptor (27)], microscopists were never- 
theless encouraged by the prediction that single- 
particle cryo-EM could, theoretically, achieve atomic 
resolution (28). The prediction was made by con- 
sidering how much electron scattering a biological 
sample could tolerate, how much structural in- 
formation or SNR such scattering can produce, 
and—in a perfect situation—how many images 
would be needed to produce a reconstruction at 
a given resolution. Although the claim was bold, 
the theory behind it was solid, and this motivated
the cryo-EM community to push the methodology
toward perfection. At last, 20 years after
Henderson made the prediction, the necessary 
breakthrough came when direct electron detec- 
tion cameras were developed and became com- 
mercially available (29). The expanded capabilities 
of the new cameras coupled with unprecedented 
and ever-increasing computational power fueled 
the development of new computational algorithms 
so that cryo-EM images could be reliably produced 
with sufficient quality for atomic structure deter-
mination (30, 31). The resolution potential of single-
particle cryo-EM at long last became a reality (32).
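
The averaging argument behind this prediction can be illustrated numerically. The sketch below is a toy model, not the calculation in (28): it assumes perfectly aligned copies of a 1D signal corrupted by independent Gaussian noise, and the names `averaged_snr` and `noise_sigma` are illustrative. Under those assumptions the power SNR of the average grows roughly in proportion to the number of images.

```python
import numpy as np

def averaged_snr(n_images, noise_sigma=5.0, n_pixels=4096, seed=0):
    """Power SNR of the mean of n_images noisy, perfectly aligned copies of a signal."""
    rng = np.random.default_rng(seed)
    # Toy "particle": a unit-amplitude sine wave (assumption, for illustration only)
    signal = np.sin(np.linspace(0.0, 8.0 * np.pi, n_pixels))
    # Independent Gaussian noise in every image and every pixel
    noise = rng.normal(0.0, noise_sigma, size=(n_images, n_pixels))
    average = (signal + noise).mean(axis=0)
    residual = average - signal
    return signal.var() / residual.var()

# Compare the SNR of a single image with that of averages over more images
for n in (1, 10, 100):
    print(n, averaged_snr(n))
```

Real data fall short of this ideal because alignment errors, radiation damage, and sample heterogeneity all reduce the benefit of averaging.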


Transformative technological
breakthroughs

The direct electron detection camera


From the beginning of cryo-EM, electron micro-
graphs were recorded on photographic film,
which was subsequently processed in darkrooms
and digitized for computational processing. For
single-particle cryo-EM, the performance of film 
was sufficient to produce 3D reconstructions at 
subnanometer resolutions (33), at which α-helices
are resolved, but atomic models cannot be built. 
It was difficult to push the resolution further 
because photographic film is particularly poor 
at retaining low-frequency signals. Images were 
therefore recorded with high defocuses in order 
to generate sufficient contrast for particle picking 
or alignment, at the price of losing high-resolution 
signal. For large icosahedral viruses, it was pos- 
sible to record two images of the same specimen 
area, the first with a low defocus to retain the 
high-resolution signal and the second with a high 
defocus to generate sufficient contrast (34). With 
substantial effort, the resolution of some virus 
particles reached near atomic level (35), but the 
whole process is slow and tedious, limiting the 
throughput of structure determination. 

In the late 1990s, charge-coupled device (CCD) 
cameras were introduced to record EM micro- 
graphs digitally. CCD cameras cannot detect elec- 
trons directly and require a phosphor scintillator 
to convert electrons into photon signals. Such con- 
version blurs a point event of a single electron 
striking the sensor into a blob of photons of much 
larger size and reduces high-resolution signals 
(36). As characterized by detective quantum effi-
ciency (DQE), which measures the level of signal
retained by a camera as a function of spatial frequency (36, 37),
CCD cameras are not suitable for routine high-
resolution structure determination (Fig. 2A).
The impact of introducing CCD cameras into cryo- 
EM was nonetheless important because it facili- 
tated automated image acquisition (38). 

A major breakthrough that elevated single- 
particle cryo-EM from “blobology” to a practical 
technique for atomic structure determination 
was the introduction of direct electron-detection 
cameras (29). Such cameras detect charges gen- 
erated directly from electrons striking the camera 
sensor, thus localizing the electron with much 
greater precision and resulting in substantially 
higher DQE than that of scintillator-based cam- 
eras. These sensors also run at high frame rates, 
enabling cryo-EM images to be recorded as a 
stack of movie frames, each of which is recorded in a
short period of time (39). Certain cameras can 
even count individual electron events on every 
single frame. The DQE of such single-electron- 
counting cameras is even higher (Fig. 2A) (31). 
Images recorded with direct-electron-detection 
cameras retain signal both at high frequency, for 
high-resolution structure determination, and at 
low frequency, for contrast required for image 
alignment (Fig. 2, B and C). 

Being able to record images as movie stacks 
facilitated many new imaging approaches that 
are critical for maximizing the achievable reso- 
lution. Most importantly, it allows the correction 
of beam-induced image motion (30, 31, 39) and 
partially mitigates radiation damage (31, 40, 41),
solving the two most difficult problems in cryo- 
EM. The use of the direct detection camera, im- 
proved motion correction, and the ability to record 
images at a high electron dose make the recon- 
struction of 3D density maps at atomic resolu- 
tion possible for many proteins. 
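
The idea of correcting beam-induced motion from a movie stack can be sketched as follows. This is a deliberately simplified illustration, not the algorithm of (30, 31, 39): it assumes whole-frame, integer-pixel, circular shifts and estimates them from the peak of an FFT-based cross-correlation against the first frame. The function names are hypothetical.

```python
import numpy as np

def estimate_shift(reference, frame):
    """Integer (dy, dx) that, applied to `frame` with np.roll, best matches `reference`."""
    # Circular cross-correlation computed through the Fourier transform
    cross = np.fft.ifft2(np.fft.fft2(reference) * np.conj(np.fft.fft2(frame))).real
    peak = np.unravel_index(np.argmax(cross), cross.shape)
    # Convert peak indices to signed shifts (FFT wrap-around convention)
    return tuple(int(p) if p <= s // 2 else int(p) - s for p, s in zip(peak, cross.shape))

def align_and_sum(frames):
    """Align every frame to the first one and return the summed image and the shifts."""
    reference = frames[0]
    total = np.zeros_like(reference, dtype=float)
    shifts = []
    for frame in frames:
        dy, dx = estimate_shift(reference, frame)
        shifts.append((dy, dx))
        total += np.roll(frame, (dy, dx), axis=(0, 1))
    return total, shifts
```

Summing the aligned frames restores high-resolution signal that would be blurred if the frames were added without correction. Practical tools additionally handle subpixel and locally varying motion and weight frames by accumulated dose; none of that is modeled here.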


New image processing algorithms 


The resolution of a reconstruction also depends 
on the conformational homogeneity of the sample 
and accuracy of image alignment of all particles 
used to reconstruct the density map. Image clas- 
sification and alignment are thus the two most 
critical steps in the computational image pro- 
cessing. Because each particle image has a poor 
SNR, individual particles cannot be classified and 
assigned to a specific class and orientation with 
certainty, making a probabilistic approach better 
suited for image classification and alignment. Early 
attempts to use a maximum likelihood-based prob- 
abilistic approach in cryo-EM image processing 
were made in the late 1990s (42). Later, a Bayesian 
approach to cryo-EM reconstruction was de- 
scribed (43) and implemented in RELION (44), 
a user-friendly program that soon became very 
popular in single-particle cryo-EM image pro- 
cessing. This approach is more powerful and ro- 
bust than traditional deterministic approaches, 
particularly in classifying a subset of particle 
images with homogeneous conformations out of 
a larger and more heterogeneous dataset. Coin- 
cidentally, this development happened at around 
the same time that direct-electron-detection cam-
eras were becoming widely used in cryo-EM.
Together, they allowed the full potential of single- 
particle cryo-EM—as theoretically predicted more 
than 20 years earlier—to be realized. 
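
The contrast between deterministic and probabilistic assignment can be made concrete with a small sketch. It is not the RELION implementation: it assumes independent Gaussian pixel noise of known standard deviation, a uniform prior over classes, and images that are already aligned (the orientation search is omitted entirely). The function names are hypothetical. Each image contributes to every class average with a weight equal to its posterior probability, rather than being assigned to a single best-matching class.

```python
import numpy as np

def class_posteriors(images, references, noise_sigma):
    """Posterior probability of each reference for each image.

    images: (n_images, n_pixels); references: (n_classes, n_pixels).
    Assumes a uniform class prior and independent Gaussian pixel noise.
    """
    sq_dist = ((images[:, None, :] - references[None, :, :]) ** 2).sum(axis=2)
    log_like = -sq_dist / (2.0 * noise_sigma ** 2)
    # Subtract the per-image maximum before exponentiating, for numerical stability
    log_like -= log_like.max(axis=1, keepdims=True)
    weights = np.exp(log_like)
    return weights / weights.sum(axis=1, keepdims=True)

def update_references(images, posteriors):
    """Probability-weighted class averages (one update step of an EM-style iteration)."""
    return (posteriors.T @ images) / posteriors.sum(axis=0)[:, None]
```

Alternating these two steps refines the class averages; at very low SNR the posteriors become broad, so uncertain images are spread over several classes instead of being forced into one.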


Automation in electron microscopy 


Technically, single-particle cryo-EM had been a 
rather complicated and tedious technique. A very 
large number of high-quality cryo-EM images are 
required for each reconstruction. Such images 
were mainly collected by individuals who had 
many years of training and experience in operat- 
ing complicated electron microscopes. The tech- 
nical requirement on users was high because they 
needed a good understanding of not only the 
electron optic system but also many fundamental 
technical issues related to cryo-EM data acquisi- 
tion. They also needed to be very patient, sitting in 
front of a microscope for many hours repeating the 
same procedure in order to collect large numbers 
of micrographs. The level of expertise required 
made single-particle cryo-EM inaccessible to the 
wider structural biology community. Fortunately,
Carragher and Potter recognized this problem
early on and pioneered the automation of high-
quality data acquisition (38). The
electron microscope itself also evolved to become 
better suited for automated data acquisition. Now, 
many cryo-EM facilities are operated in a way sim- 
ilar to x-ray synchrotron beamlines—supported 
by a few highly trained staff scientists. In such 
facilities, regular users with minimal training can 
also acquire high-quality cryo-EM data, even re- 
motely, by using automated procedures. 


A new era of structural biology 


Thanks to these technological breakthroughs 
in single-particle cryo-EM, structural biology 
has entered a new era. Structures of many dif- 
ficult crystallization targets are now within reach. 
One such area is integral membrane proteins. A 
specific example is the transient receptor poten- 
tial (TRP) ion channel. The TRP channel super- 
family contains seven subfamilies with a total of 
27 members in humans (45, 46). Each of these
channels plays different physiological roles; some 
of them are drug targets for treatment of various 
human diseases (47). With the exception of some 
small domains, attempts to crystallize any mem- 
ber of the TRP channel superfamily had failed (48). 
This lack of structural information ended when 
atomic structures of the TRPV1 ion channel—a 
capsaicin receptor that plays a physiological role 
in sensing heat and activating pain pathways— 
were determined by means of single-particle cryo- 
EM in three different functional states (49, 50). 
This discovery demonstrated the power of single- 
particle cryo-EM and showed that it rivaled x-ray 
crystallography in determining atomic structures 
of challenging protein complexes that resist crys-
tallization. The TRPV1 structures prompted many
crystallographers to think seriously about cryo- 
EM, and many quickly seized the opportunity to 
apply it to their favorite difficult targets. With the 
roadblock of crystallization removed, atomic struc- 
tures of integral membrane proteins are now being 
determined at a rapid pace. In less than 5 years,
atomic structures of at least one member of each
of the seven TRP channel subfamilies have now
been determined (Fig. 3).

Fig. 3. Single-particle cryo-EM enables atomic structure determination
of TRP channels. Ribbon diagrams of atomic structures from each subfamily
of TRP channel superfamily. They are TRPV1 (49), TRPA1 (61), TRPM4
[Panel labels: TRPV1, 3J5P (2013); (2015); PKD2/TRPP2, 5T4D (2016)]

Single-particle cryo-EM is also game-changing 
for structural studies of many large and dynamic 
complexes and machineries that were impossible
to crystallize. A traditional approach was to
fit crystal structures of individual components 
or domains into lower-resolution cryo-EM densi-
ty maps of the whole complex (51). Now, atomic
structures of many large complexes are deter- 
mined directly by using single-particle cryo-EM. 
An excellent example is the spliceosome complex. 
Past efforts produced atomic structures of some 
small components (52), as well as low-resolution 
cryo-EM structures of the whole complex (53). 
Now, in only a few short years, atomic structures 
of the spliceosome in different functional states 
have been determined (54, 55). 

One can easily call forth many more examples 
to illustrate how the recent technological break- 
throughs in single-particle cryo-EM have changed 
how we tackle complex biological problems. The 
rapid pace of advancement in structural biol- 
ogy is unprecedented. It has also attracted major 
pharmaceutical companies, with many hoping to 
integrate cryo-EM into structure-based drug
discovery and optimization. 


The future 


Single-particle cryo-EM continues to move for- 
ward fast, with many new technologies being 
developed. One such example is the Volta phase 
plate, which is a thin carbon film placed in the 
back focal plane of the microscope’s objective lens. 
It adds a phase shift to the CTF so that an image 
recorded at near focus has good contrast (56),
facilitating the study of very small proteins (57).
Another example is the development of a new 
sample preparation technology that is fundamen- 
tally different from the current blotting method 
(58). The new method promises many benefits, 
including reducing the total amount of sample 
needed from microliters to nanoliters. 

As impressive as these innovations have been, 
there are still many technical details that can be 
improved. Regularly achieving resolutions close 
to 2 Å and beyond is one goal; improving robust-
ness and throughput is another. Once very high
resolutions can be achieved reliably, pharmaceu- 
tical companies will be able to routinely use EM 
to speed up structure-based drug discovery. The 
range of what can be studied by means of single- 
particle cryo-EM can also be expanded to include 
smaller or bigger targets with higher resolution, 
as well as more dynamic complexes or assemblies 
with irregular shapes. 

There is still more that single-particle cryo-EM 
can offer to further biological discoveries beyond 
structure determination if we continue to push 
the boundary of this technology. For example, 
in theory, image classification can sort a cryo-EM 
dataset that contains particles of heterogeneous 
conformation or composition into multiple classes, 
each corresponding to a different functional state. 
By freezing cryo-EM grids at specific time points, 
it could be possible to derive structural infor- 
mation in a time-dependent manner (59). It may 
therefore be possible to understand and break 
down the complex cycles, movements, and processes 
of biological macromolecular complexes and ma- 
chineries step by step, in complete atomic detail. 
In the future, it may also be possible to study
through affinity-purifying specific proteins directly
from cells onto EM grids for single-particle cryo-
EM studies.

[Panel label: NOMPC/TRPN1, 5VKQ (2017)]
C (also named TRPN1) (64), PKD2 (or TRPP2) (65),
and TRPML (66). The rapid pace of integral membrane protein structure
with single-particle cryo-EM and is unprecedented.

The past 40 years have seen the evolution of 
single-particle cryo-EM from blobology to a routine 
and powerful method for atomic structure deter-
mination. The technique’s new popularity is at-
tracting not only more users but also talent from
different fields, ranging from physics, materials
science, and mathematics to machine learning.
This infusion of new ideas and a larger commu- 
nity heralds an even brighter future for single- 
particle cryo-EM full of method development, 
collaboration, and most importantly, wonderful 
new biological discoveries.

REVIEW


Visualizing and discovering 
cellular structures with 
super-resolution microscopy 


Yaron M. Sigal, Ruobo Zhou, Xiaowei Zhuang* 


Super-resolution microscopy has overcome a long-held resolution barrier—the diffraction 
limit—in light microscopy and enabled visualization of previously invisible molecular details in 
biological systems. Since their conception, super-resolution imaging methods have 
continually evolved and can now be used to image cellular structures in three dimensions, 
multiple colors, and living systems with nanometer-scale resolution. These methods have 
been applied to answer questions involving the organization, interaction, stoichiometry, and 
dynamics of individual molecular building blocks and their integration into functional 
machineries in cells and tissues. In this Review, we provide an overview of super-resolution 
methods, their state-of-the-art capabilities, and their constantly expanding applications
to biology, with a focus on the latter. We will also describe the current technical challenges
and future advances anticipated in super-resolution imaging. 


Fluorescence microscopy has been central
in shaping our understanding of the mo-
lecular organization and interactions of
biological systems. Its high molecular spec-
ificity and multicolor imaging capability
allow direct visualization of interactions between 
specific molecular species, and its low invasive- 
ness allows the study of living systems under 
physiological conditions. However, a main chal- 
lenge in fluorescence microscopy was the limited 
spatial resolution set by the diffraction of light. 
This resolution limit, first described by Ernst Abbe 
in 1873, restricts the smallest objects that can be 
resolved by conventional light microscopes. As 
a result, objects separated by a distance smaller
than approximately half of the wavelength of 
visible light, i.e., ~200 to 300 nm, are indistinguish- 
able, making many molecular structures in cells 
unresolvable. The advent of super-resolution imag- 
ing methods has shattered this limit. In this Re- 
view, we will provide an overview of the methods 
that surpass the diffraction limit in the far field, 
with emphasis on the new biological insights 
afforded by these methods. 
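
For orientation, the diffraction limit quoted above can be computed from the Abbe formula, d = wavelength / (2 NA). The sketch below is a minimal illustration; the function name and the example wavelength and numerical aperture (NA) values in the usage note are assumptions, not values taken from this Review.

```python
def abbe_limit_nm(wavelength_nm, numerical_aperture):
    """Abbe lateral resolution limit d = wavelength / (2 * NA), in nanometers."""
    if wavelength_nm <= 0 or numerical_aperture <= 0:
        raise ValueError("wavelength and numerical aperture must be positive")
    return wavelength_nm / (2.0 * numerical_aperture)

# Example call with illustrative values: green light and NA = 1.0
limit = abbe_limit_nm(500.0, 1.0)
```

With an NA of about 1, the limit is half the wavelength, which for visible light falls in the range of a few hundred nanometers stated in the text; high-NA oil-immersion objectives lower it only modestly.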


Overview of super-resolution
imaging methods

The key to overcoming the diffraction limit lies 
in the ability to distinguish molecules that reside 
within the same diffraction-limited volume. This 
has been achieved by two main categories of ap- 
proaches. The first category accomplishes this 
in a spatially coordinated manner by using pat- 
terned illumination to differentially modulate 
the fluorescence emission of molecules within 
the diffraction-limited volume and thereby achieve 
separate detection of these molecules. The pioneer-
ing method in this category is stimulated emission
depletion (STED) microscopy (1, 2), subsequently
generalized to reversible saturable optical linear 
fluorescence transitions (RESOLFT) (3). STED 
and RESOLFT overcome the diffraction limit by 
accompanying a focused excitation beam with a 
spatially patterned “depletion” beam, typically 
in a donut shape, which serves to counteract ex- 
citation through either stimulated emission (STED) 
(1, 2) or other types of fluorescence transitions,
such as photoswitching (RESOLFT) (3). As a result, 
only molecules at the very center of the donut- 
shaped beam (where the laser intensity is near 
zero) can emit light, thus creating a region of 
fluorescence emission that is much smaller than 
a typical focal spot of the light microscope. The
reverse strategy is also possible, with the donut 
beam serving as patterned activation rather than 
depletion, limiting the emission-free region in- 
stead of emission region to the center of the 
beam (4). Scanning these beams across the sample 
then generates an image with a resolution much 
higher than the diffraction limit. Various other 
illumination patterns can also be used to increase 
the spatial frequency of the emission region and 
hence the image resolution (4). For example, in 
structured illumination microscopy (SIM), the 
sample is excited by a series of standing waves with 
different orientations or phases to increase the 
spatial frequency detectable by the microscope 
(5). Because the standing-wave pattern is itself 
limited by diffraction, the linear form of SIM 
only extends the diffraction limit by a factor of 2, 
whereas the nonlinear form of SIM (NL-SIM) 
overcomes the diffraction limit by using the non- 
linear or saturated response of fluorophores to 
further increase the spatial frequency of the emis- 
sion pattern (5), similar to STED and RESOLFT 
(4). Unlike STED and RESOLFT, which generate 
super-resolution images directly from the rec- 
orded raw data, SIM and NL-SIM require addi- 
tional computational treatment to reconstruct 
final images (4, 5).

The second category of methods achieves the
separation of molecules by stochastically turning 
on individual molecules within the diffraction- 
limited volume at different time points, includ- 
ing stochastic optical reconstruction microscopy 
(STORM) (6) and (fluorescence) photoactivated 
localization microscopy [(F)PALM] (7, 8), and sub- 
sequent variations of these approaches (9, 10). 
When isolated in space, the positions of individual 
molecules can be determined to nanometer or 
even subnanometer precision by localizing the 
center positions of their images (11-13). However,
molecules within the same diffraction-limited vol- 
ume generate overlapping images, which is the 
fundamental cause of the diffraction limit in resolu- 
tion. STORM and PALM overcome this limit by 
switching on only a stochastic subset of fluores- 
cent molecules within a field of view at any given 
time such that their images do not substantially 
overlap, allowing their positions to be localized with 
high precision; these molecules are then switched 
off (or bleached) and a stochastically different sub- 
set of molecules are switched on and localized— 
iterating this process allows a super-resolution 
image to be constructed from numerous molec- 
ular localizations accumulated over time (6-8). 
Such stochastic activation of molecules is typi- 
cally achieved by using photoswitchable dyes 
or fluorescent proteins (6-10). A variety of photo-
switchable probes have been used for this ap-
proach, in some cases leading to the creation of
different acronyms subsequently, but the imaging 
principle is the same as that for STORM and PALM. 
In addition to using photoswitchable probes, tran- 
sient binding of fluorescent probes can also be 
used to stochastically “turn on” fluorescent signals 
in space and time, as in point accumulation for 
imaging in nanoscale topography (PAINT) (14).
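
The localization step at the heart of these methods can be sketched with a toy simulation. The example below assumes a single isolated emitter, a Gaussian point-spread function, shot noise only, and zero background; it estimates the position with an intensity-weighted centroid as a simple stand-in for the Gaussian fitting commonly used in practice. The function names and default parameters (photon count, PSF width in pixels) are illustrative assumptions.

```python
import numpy as np

def simulate_spot(center_xy, n_photons=2000, psf_sigma=1.3, size=15, seed=0):
    """Pixelated image of one emitter: Gaussian PSF sampled with Poisson (shot) noise."""
    rng = np.random.default_rng(seed)
    y, x = np.mgrid[0:size, 0:size]
    cx, cy = center_xy
    psf = np.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2.0 * psf_sigma ** 2))
    expected = n_photons * psf / psf.sum()
    return rng.poisson(expected).astype(float)

def localize_centroid(image):
    """Estimate the emitter position (x, y) as the intensity-weighted centroid."""
    y, x = np.mgrid[0:image.shape[0], 0:image.shape[1]]
    total = image.sum()
    return (x * image).sum() / total, (y * image).sum() / total

# Localize one simulated emitter; the estimate lies well within one pixel of the truth
image = simulate_spot((7.3, 6.8))
x_est, y_est = localize_centroid(image)
```

Although the spot itself is several pixels wide, its center can be estimated to a small fraction of that width. Repeating this for many stochastically activated subsets and plotting all estimated positions yields the super-resolution image. With nonzero background the plain centroid becomes biased, which is one reason fitting-based estimators are preferred.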

Recently, a new super-resolution imaging 
method named MINFLUX has been developed 
that combines strengths from both categories 
of approaches, by using stochastic switching of 
individual molecules to enable the separate detec- 
tion of nearby molecules, along with patterned il- 
lumination, such as a donut-shaped beam, to 
achieve ultrahigh-precision localization of individual 
molecules by detecting local emission minima (15).

In addition to the above methods, which di- 
rectly overcome the diffraction limit optically, a 
different form of super-resolution microscopy, 
expansion microscopy (ExM), has been recently 
developed, which increases the image resolution 
effectively through physical expansion of samples 
(16). In ExM, the specimen is embedded in a gel 
with the labeling probes attached to the gel. The 
sample is then digested to leave only the labeling 
probes attached to the gel followed by gel expan- 
sion to increase the probe separation, allowing 
super-resolution images to be taken with diffraction- 
limited microscopes. 
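
The resolution gain from physical expansion can be expressed as a one-line calculation. The sketch below is an idealized illustration that assumes perfectly isotropic, distortion-free expansion and ignores probe size and labeling density; the function name and the example numbers are hypothetical and are not taken from (16).

```python
def effective_resolution_nm(optical_resolution_nm, expansion_factor, rounds=1):
    """Resolution in pre-expansion sample coordinates after isotropic expansion."""
    if expansion_factor <= 0 or rounds < 0:
        raise ValueError("expansion_factor must be positive and rounds non-negative")
    return optical_resolution_nm / (expansion_factor ** rounds)

# Illustrative numbers only: a 300-nm optical limit and a 4-fold linear expansion
one_round = effective_resolution_nm(300.0, 4.0)
two_rounds = effective_resolution_nm(300.0, 4.0, rounds=2)
```

Because the expansion factors multiply across rounds, repeated expansion improves the effective resolution geometrically, at least until probe size and gel distortions become limiting.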

Super-resolution technologies are constantly 
expanding, including both variations of the above 
approaches and other distinct methods, such as 
fluctuation-based methods and computer-vision- 
based methods. Owing to the limited space of this 
short review and its focus on biological applica-
tions, we cannot describe all methods here but
refer interested readers to other reviews (4, 9, 10)
for additional coverage on super-resolution
technologies.

Fig. 1. Quantitative biological insights from three directions of applications
of super-resolution imaging. (A) (i) STED images showing distinct distribution
patterns of the envelope protein Env (red) in mature (left) and immature (right)
HIV-1 particles attached to the cell, overlaid with the cell surface HIV-1 receptor
CD4 (blue). (ii) PALM images showing the organization of ESCRT-I subunit Tsg101
(green) in an HIV assembly site marked by HIV Gag proteins (red) in lateral (top)
and axial (bottom) views. (iii) 3D STORM images of a sperm-specific calcium
channel (CatSper1) showing four linear domains along the sperm flagella. The
z-position information is color-coded. (iv) PALM image on a bacterial cell showing
the distribution of the ParA ATPase (green) with the ParB DNA binding protein
(red) localized to the cell poles, for the coordination of chromosome segregation
and cell division. (v) Left: Overlay of PAINT (red) and diffraction-limited (gray)
images of the ER obtained using lattice light-sheet microscopy. Right: PAINT
image from the left panel, but color-coded by the z-position information. White
arrowheads indicate areas that appear as sheets in diffraction-limited images but
are resolved as connected tubular structures in super-resolution images. (vi) STED
image of the proapoptotic cell-death mediator Bax (green) showing ring structures
in apoptotic mitochondria marked by Tom22 (red). (vii) STORM image showing
interactions between mitochondria (green) and purinosomes marked by the core
protein FGAMS (magenta). (viii) Comparison of STORM images of telomeric DNA
in mouse embryo fibroblasts in the presence (left) and absence (right) of the
shelterin protein TRF2 that is required for t-loop formation. (ix) Top: Comparison
of diffraction-limited (left) and 3D STORM (right) images for DNA in a chromatin
domain in the nucleus of Drosophila Kc167 cells. Bottom: Differential DNA
compaction of transcriptionally active (red), inactive (gray), or polycomb-repressed
(blue) epigenetic domains visualized using STORM. (B) (i) PALM images of
proto-oncogene cRAF clusters on the cell plasma membrane, with (bottom) and
without (top) coexpression of KRAS^G12D, which induces cRAF clustering. (ii) 3D
PALM image of molecular clusters with various sizes formed by a secretion
system protein PrgH near the membrane of a bacterial cell. (iii) STORM images
of endocytic vesicles displaying distinct vesicle size and phosphatidylinositol
3-phosphate (PI3P) content. The number of PI3P binding sites on each vesicle (n)
is indicated. (C) (i) Durations (t_trap) for three lipid types [phosphoethanolamine
(PE, gray), sphingomyelin (SM, red), and sphingomyelin after cholesterol depletion
(SM COase, green)] that are differentially trapped in ~20-nm nanodomains at the
plasma membrane, which are detected and distinguished by STED-FCS and
confocal single-molecule tracking. (ii) Single-particle tracking of a 30S ribosomal
subunit protein in a bacterial cell by using MINFLUX. Trajectories of individual
molecules are shown in different colors. (iii) Time-lapse STED images of a region
of the somatosensory cortex of a living mouse with enhanced yellow fluorescent
protein (EYFP)-labeled neurons, showing dynamics of dendritic spines.
(iv) Time-lapse STORM images showing fission (green arrowheads) and fusion
(red arrowheads) events of mitochondria, with thin tubular structures connecting
neighboring mitochondria as fission and fusion intermediates. Figures are modified
from the following sources: (A) i (49); ii (60); iii (61), with permission from Elsevier;
iv (62), with permission from Springer Nature; v (35); vi (64); vii (66); viii (69),
with permission from Elsevier; ix (60), with permission from Springer Nature;
(B) i (66); ii (67); iii (64); (C) i (69); ii (15); iii (71); iv (4).


Imaging capabilities of
super-resolution microscopy

Three-dimensional (3D) imaging

The 3D nature of biological structures calls for 
super-resolution in all three dimensions. For 
methods based on stochastic activation of single 
molecules, such as STORM and PALM, achieving 
3D super-resolution imaging requires high-precision 
localization not only in the xy plane, but also in
the z direction along the optical axis. This was
first achieved by astigmatism imaging [by using
a cylindrical lens to create a z-dependent point-
spread function (PSF)] (17), followed by various
other approaches including bifocal plane imaging 
(18), PSF engineering (19), and interferometry (20), 
among others (9, 10, 21). In STED and RESOLFT, 
isotropic 3D super-resolution imaging was achieved 
by generating a depletion illumination pattern to 
counteract excitation in all directions surrounding 
the focal point—for example, by using a donut- 
shaped STED beam in conjunction with two op- 
posing objectives (4, 22). 


Image resolution 


Both the methods based on patterned illumina- 
tion, like STED, RESOLFT, and NL-SIM, and the 
methods based on single-molecule switching and 
localization, like PALM and STORM, are diffraction- 
unlimited, and thus do not have a theoretical 
resolution limit. In practice, however, many factors 
can influence the achievable resolution, including 
the excitation and detection schemes, and the 
photophysical properties and size of fluorescent 
probes, as well as the labeling and sampling 
density of these probes. In biological applications, 
resolutions achieved by these methods are typi- 
cally in the range of 10 to 70 nm, with sub-10 nm 
resolution achieved in some cases (9, 10). 

For the patterned-illumination-based methods, 
the spatial frequency (or sharpness) of the final 
emission pattern determines the image resolu- 
tion. For example, in STED and RESOLFT, the 
donut-shaped depletion beam limits the fluores- 
cence emission zone to the very center of the 
donut beam. The stronger the depletion light, the 
narrower this emission zone and the higher 
the achievable image resolution (4). However, 
strong illumination can lead to substantial photo- 
bleaching, phototoxicity, and enhanced back- 
ground noise. Hence, the resolutions typically 
achieved are tens of nanometers, although res- 
olution as high as a few nanometers has also 
been demonstrated by using probes with ultra- 
high photostability, such as diamond nitrogen- 
vacancy centers (4). With isoSTED, isotropic 
3D resolution of ~30 nm has been demonstrated 
(4, 22). Combining patterned illumination with 
photoswitchable probes, RESOLFT (3, 4) has also 
achieved ~30-nm isotropic 3D resolution (23). 
Similarly, by combining sinusoidal patterned il- 
lumination and photoswitchable probes, and 
using additional computational image reconstruc-
tion, NL-SIM has demonstrated ~45- to 60-nm
resolution in 2D using saturated depletion (SD NL-
SIM) or patterned activation (PA NL-SIM) (24, 25).
PA NL-SIM has been extended to 3D with the help 
of lattice light sheet microscopy (26), providing a 
resolution of ~120 to 230 nm in 3D (25). 
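
The statement above that stronger depletion light narrows the STED emission zone is often summarized by a square-root scaling law, d ≈ wavelength / (2 NA sqrt(1 + I/I_s)), where I is the peak depletion intensity and I_s is the saturation intensity of the fluorophore. The sketch below evaluates this commonly quoted approximation; it is not taken from this Review, and the function name and the example values are illustrative assumptions.

```python
import math

def sted_resolution_nm(wavelength_nm, numerical_aperture, saturation_factor):
    """Approximate STED resolution for saturation_factor = I / I_s.

    A saturation factor of 0 (no depletion beam) recovers the Abbe limit.
    """
    if wavelength_nm <= 0 or numerical_aperture <= 0 or saturation_factor < 0:
        raise ValueError("inputs must be positive (saturation_factor may be zero)")
    return wavelength_nm / (2.0 * numerical_aperture * math.sqrt(1.0 + saturation_factor))

# Illustrative values: resolution without depletion and at I / I_s = 99
confocal_like = sted_resolution_nm(650.0, 1.4, 0.0)
depleted = sted_resolution_nm(650.0, 1.4, 99.0)
```

The square root explains the trade-off described in the text: a 10-fold gain in resolution requires roughly a 100-fold increase in depletion intensity, with the attendant photobleaching and phototoxicity.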

For single-molecule-switching-based methods, 
such as STORM and PALM, the resolution de- 
pends on the photophysical properties of the 
fluorophores. Although many fluorophores ex- 
hibit blinking or switching behavior, only those 
with sufficient brightness and proper on-off 
switching kinetics yield high-quality images 
(27). The achievable image resolution depends 
on the number of photons detected from in- 
dividual molecules, known as the photon budget. 
Typical experiments with bright photoswitch-
able dyes provide ~20- to 30-nm xy resolution,
whereas the resolution is worse for fluorescent
proteins because of their lower photon budget.
The resolution is often worse in the z direction,
but the use of interferometry (20, 28, 29) or
specially engineered PSF (30) can improve the
z resolution to become equal to or even better
than the xy resolution. For example, interfer-
ometry can provide <10-nm z resolution, though
a more complicated imaging setup is needed
(20, 28, 29). In general, the resolution in both
xy and z directions can be increased by improv-
ing the photon budget of the fluorophores. For
example, the development of ultrabright photo- 
activatable dyes allowed a resolution as high as 
a few nanometers to be achieved on biological 
structures using STORM (31). More recently, using
stochastic binding of dye-labeled DNA probes, 
DNA-PAINT also achieved similar image resolu- 
tion on DNA-origami nanostructures (32). How- 
ever, the time required to detect such a large 
number of photons for each molecule substantially 
increased the acquisition time per image. The 
recently developed MINFLUX thus represents 
an important advance in that it uses patterned 
excitation to drastically increase the localization 
precision (or reduce the number of photons re- 
quired to reach a set precision), achieving an im- 
pressive localization precision of ~1 nm with an 
orders-of-magnitude lower photon budget (15). In 
ExM, the resolution depends on the number of 
rounds of sample expansion and the expansion factor 
per round, and a resolution of ~25 nm has been 
demonstrated with two rounds of expansion (33). 
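
The link between photon budget and localization precision discussed above can be illustrated with the standard shot-noise-limited approximation for camera-based localization, precision ≈ s / sqrt(N), where s is the standard deviation of the PSF and N is the number of detected photons. The sketch below ignores background and pixelation, does not apply to MINFLUX (which attains a given precision with far fewer photons), and uses hypothetical function names and illustrative numbers.

```python
import math

def localization_precision_nm(psf_sigma_nm, n_photons):
    """Shot-noise-limited localization precision, s / sqrt(N)."""
    if psf_sigma_nm <= 0 or n_photons <= 0:
        raise ValueError("psf_sigma_nm and n_photons must be positive")
    return psf_sigma_nm / math.sqrt(n_photons)

def photons_needed(psf_sigma_nm, target_precision_nm):
    """Photons required to reach a target precision under the same approximation."""
    if psf_sigma_nm <= 0 or target_precision_nm <= 0:
        raise ValueError("inputs must be positive")
    return math.ceil((psf_sigma_nm / target_precision_nm) ** 2)

# Illustrative numbers: a PSF with s = 100 nm and a 1-nm target precision
required = photons_needed(100.0, 1.0)
```

Because the required photon number grows with the square of the desired improvement, pushing conventional localization from tens of nanometers to about 1 nm demands orders of magnitude more photons per molecule, which is why acquisition time becomes the bottleneck.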

Other factors can also limit the final image res- 
olution, such as the size of the fluorescent probes 
and the labeling density, which affect all super- 
resolution methods. For single-molecule-based 
approaches, methods that increase the number 
of times each target is sampled can also increase 
the final resolution if the resolution is limited by 
sampling density. This includes using fluorophores 
that undergo many on-off switching cycles (27), 
diffusible probes that can sample multiple loca- 
tions of the target (34), and PAINT approaches 
that sample the target numerous times using 
reversible probe binding (35). 

Finally, thick samples pose additional chal- 
lenges, including reduced localization precision 
for out-of-focus molecules, optical aberration, and 
light scattering, as well as increase in background
noise, all of which can lower image resolution.
Various PSF engineering methods have been de- 
veloped to allow high localization precision over 
substantially longer focal depths (19, 21, 30). 
Adaptive optics have been used to correct for 
aberrations in super-resolution imaging of thick 
samples (29), and light-sheet illumination provides 
an effective optical-sectioning approach to reduce 
background in thick-sample super-resolution imag- 
ing (36). Tissue clearing methods can reduce not 
only aberration but also scattering and are par- 
ticularly powerful for thick-sample imaging (37). 
Alternatively, serial physical sectioning has also 
been used to reconstruct super-resolution images 
over large volumes of tissue (38). 


Live-cell imaging, temporal resolution, 
and phototoxicity 


Several super-resolution methods have dem- 
onstrated live-cell imaging. As scanning-based 
techniques, STED and RESOLFT can image a 
relatively small field of view (FOV) with very 
high temporal resolution, and thus have the 
impressive capability of probing millisecond 
dynamics of cellular structures at the spatial 
resolution of tens of nanometers (39). Although 
the time resolution decreases with increasing 
FOVs, highly parallelized RESOLFT with 100,000 
intensity minima, effectively akin to 100,000 
tightly spaced donut patterns, allows subsecond 
time resolution for large FOVs (40). 

Because STORM and PALM are widefield imaging 
methods, their time resolutions do not change as 
rapidly with the FOV size. Subsecond time resolution 
at ~20- to 30-nm spatial resolution has 
been achieved for large FOVs in live-cell imaging 
by STORM using fast-switching dyes (41) and fast 
scientific-CMOS cameras (42). Several recently 
developed algorithms to localize a high density 
of molecules with overlapping images (43) can 
further increase the time resolution of these meth- 
ods. In addition, the single-particle-tracking mode 
of PALM, STORM, and PAINT (44, 45) allows 
movement of individual molecules to be tracked 
with time resolutions of milliseconds to tens of 
milliseconds at high molecular concentrations. 
The ability of MINFLUX to achieve high local- 
ization precision with a minimal photon budget 
has led to a drastic increase in the tracking time 
resolution of molecules in live cells to the 
submillisecond scale (~100 μs), coupled with a 
corresponding increase in the number of snapshots possible 
for each molecule before photobleaching (15). 

For live-cell imaging, in addition to achieving 
high spatiotemporal resolution, it is also impor- 
tant to reduce photobleaching and phototoxicity 
to prolong the overall duration of imaging and to 
minimize perturbations to the biological systems. 
Because of the trade-offs between spatial and 
temporal resolutions and between spatiotemporal 
resolution and phototoxicity/imaging duration, 
it is possible to reduce the spatial resolution of 
both the patterned-illumination-based methods, 
such as STED and RESOLFT, and the single- 
molecule-switching-based methods, such as STORM 
and PALM, to trade for higher time resolution, or 
lower phototoxicity and longer imaging duration. 
In addition, adaptively changing the intensity of 
the STED beam based on the presence or absence 
of fluorophores, as in DyMIN, results in a sub- 
stantial reduction in photobleaching and photo- 
toxicity (46). By using photoswitching instead of 
stimulated emission, RESOLFT requires a much 
lower light intensity than STED, and thus dras- 
tically reduces phototoxicity in live-cell super- 
resolution imaging (4). When the spatial resolution 
requirement is not particularly high (~100 nm), 
SIM is a widely used live-cell imaging method 
because of its capability for high-speed widefield 
imaging with low phototoxicity. The recently re- 
ported PA NL-SIM demonstrated live-cell imaging 
with ~60-nm spatial resolution and subsecond 
time resolution over large FOVs and tens of time 
points (25). In general, using light-sheet illumina- 
tion for optical sectioning can also reduce photo- 
toxicity in imaging (36). The recently developed 
lattice light-sheet microscopy (26) further decreases 
phototoxicity and improves optical sectioning (to 
~300 nm) compared to previous light-sheet schemes, 
and has been used in conjunction with super- 
resolution approaches to improve their volumet- 
ric live-cell imaging capability (25, 26, 35). 


Quantitative biological insights offered 
by super-resolution imaging 


Super-resolution imaging has transformed our 
understanding of biological systems, and its 
applications are expanding so rapidly that a 
comprehensive description is not possible in a short review. 
Instead, we will highlight in this section the types 
of quantitative insights that can be obtained by 
super-resolution imaging, with representative 
examples illustrating each case (Fig. 1). In the 
next section, we will provide more detailed de- 
scriptions of a few examples to further illus- 
trate the power of super-resolution imaging 
(Figs. 2 to 4). 


Spatial organization and molecular 
interaction of cellular structures 


The nanometer-scale resolution afforded by 
super-resolution imaging has substantially advanced our 
ability to interrogate the spatial organization of 
molecular structures in cells (9, 10). In addition, 
multicolor super-resolution imaging has allowed 
molecular interactions to be examined at 
unprecedented resolution (9, 10). With these abilities, 
super-resolution imaging has provided new insights 
into numerous cellular structures, and even 
led to discoveries of previously unknown cellular 
structures, such as the membrane-associated 
periodic skeleton (MPS) in neurons (47) detailed in 
the next section. 

Fig. 2. The membrane-associated periodic skeleton (MPS) in neurons 
discovered by super-resolution imaging. (A) Quasi-1D periodic MPS 
observed in axons by using STORM. Left: Comparison of diffraction-limited 
(top) and 3D-STORM (bottom) images of actin in axons. The STORM image shows 
the periodic distribution of actin rings along the axon that is obscured by 
diffraction-limited imaging. Middle: Two-color STORM images showing the 
periodic distributions of and spatial relationship between actin, spectrins 
(βII- and βIV-spectrin), and voltage-gated sodium channels (Nav). Right: 
Schematic of the 1D MPS structure showing the organization of actin, spectrin 
tetramers, and adducin. Modified from (47). (B) Top: Schematic of a node of 
Ranvier. Center: STED image showing the periodic distribution of ankyrin-G 
(AnkG) on the 1D MPS structure at a node of Ranvier. Modified from (74). 
Bottom: STED image showing the periodic distribution of the adhesion 
molecule Caspr on the 1D MPS structure observed flanking a node of Ranvier. 
Modified from (76). (C) MPS structures observed in dendrites. Top: 1D MPS in a 
dendritic region observed by STORM imaging of βII-spectrin. Modified from 
(72). Upper middle: 1D MPS observed in a dendritic region by STED imaging of 
actin. Modified from (73). Lower middle: 2D polygonal lattice-like arrangement 
of MPS components observed in a dendritic region by STORM imaging of 
actin. Bottom: A magnified region of the STORM image (left) and the 
corresponding 2D autocorrelation analysis (right) are shown. Modified from 
(77). (D) 2D MPS observed on the soma of a neuron by STORM imaging of 
βIII-spectrin (top right) along with a 2D autocorrelation analysis of the boxed 
region (top left). Bottom: Schematic of the 2D MPS structure. Modified from (77). 

At the cell surface, membrane proteins such as 
receptors, channels, vesicle scission proteins, and 
viral fusion proteins have been investigated by 
various super-resolution approaches and are often 
found to assume functionally important spatial 
organizations. For example, it was shown that 
the SNARE complex component syntaxin-1 is 
densely packed within discrete clusters that are 
regulated by the lipid composition (48). The HIV 
envelope protein (Env) was observed to reorganize 
upon maturation, which is important for viral 
entry (49) (Fig. 1A, i), whereas the ESCRT com- 
plex is localized to the virus budding site and 
plays an important role in HIV budding (50) 
(Fig. 1A, ii). The calcium channel CatSper was 
shown to adopt a linear-domain organization 
along the sperm tail together with other sig- 
naling and scaffolding molecules, playing an 
important role in calcium signaling and sperm 
activity (51) (Fig. 1A, iii). In the cytoplasm, 
super-resolution imaging has provided new insights 
into the organization of cytoskeleton structures 
and membrane organelles, as well as other 
molecular assemblies. In addition to the discovery 
of the MPS in neurons (47), as will be detailed in 
the next section, novel organization has also been 
observed for other cytoskeletal structures, such as 
the ParA/ParB system in bacteria (52) (Fig. 1A, iv) 
and focal adhesions connecting the cytoskeleton 
to the plasma membrane (53). For membrane 
organelles, super-resolution imaging has revealed, 
for example, densely packed and dynamic tubular 
structures in endoplasmic reticulum (ER) sheets 
(35) (Fig. 1A, v), ring structures of Bax on apoptotic 
mitochondria (54, 55) (Fig. 1A, vi), and synergistic 
interactions between mitochondria and puri- 
nosomes (56) (Fig. 1A, vii). In addition to protein 
structures, super-resolution imaging has also 
provided new insights into RNA distributions 
and interactions in cells (57, 58). In the cell 
nucleus, super-resolution imaging has revealed 
interesting organizations of DNA and DNA- 
interacting proteins, such as the TRF2-dependent 
telomere loop (t-loop) formation important for 
DNA end protection (59) (Fig. 1A, viii), distinct 
chromatin organization and compaction in dif- 
ferent epigenetic states (60) (Fig. 1A, ix), and 
cell-type-dependent organizations of nucleo- 
somes (61). 


Stoichiometry of molecular complexes 


Although measuring stoichiometry by spatially 
resolving individual subunits within molecular 
complexes is still challenging, the ability to 
activate and localize individual molecules by 
PALM and STORM has triggered growing interest 
in stoichiometric characterizations within intact 
cells. However, it is important to note that the 
number of measured single-molecule localiza- 
tions is not equivalent to the number of mol- 
ecules because of two complications. The first 
arises from imperfect labeling. New labeling ap- 
proaches, such as using gene editing to label 
endogenous proteins with rapidly maturing, 
monomeric fluorescent proteins or with protein 
or peptide tags that can be conjugated to dyes 
with near 100% efficiency, can help mitigate 
this challenge. The second complication arises 
from complex fluorophore switching: No dye or 
fluorescent protein has the ability to give precisely 
one localization per molecule because fluoro- 
phores blink (multiple times), and most fluoro- 
phores also have an inactivatable fraction. 
Multiple methods have been developed to com- 
bat this problem, including calibrations of 
fluorophores using standards of known stoi- 
chiometry or quantification and modeling of 
blinking properties (62-64). STED has also been 
used to quantify the number of molecules based 
on coincident photon detection (65). These meth- 
ods have been applied to quantifying, for example, 
the number of proteins in flagellar motors (62), 
receptor complexes (63), kinase complexes (66) 
(Fig. 1B, i), and secretion machinery (67) (Fig. 
1B, ii), as well as the number of lipid binding 
sites in endocytic vesicles (64) (Fig. 1B, iii). 
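
The calibration idea described above can be sketched with a deliberately simple linear model. It assumes that a standard of known stoichiometry yields a mean number of localizations per labeled molecule (a single factor that absorbs repeated blinking and the undetected fraction), and that an unknown complex can then be counted by dividing its localization count by that factor, with an optional correction for an assumed labeling efficiency. The function names and numbers are hypothetical; the cited studies (62-64) rely on more detailed models of blinking kinetics.

```python
def localizations_per_molecule(standard_localizations, known_copies):
    """Calibrate on complexes of known stoichiometry.

    standard_localizations: localization counts, one per imaged standard complex.
    known_copies: number of labeled subunits in each standard complex.
    Returns the mean number of localizations produced per molecule.
    """
    if not standard_localizations or known_copies <= 0:
        raise ValueError("need at least one standard and a positive copy number")
    mean_count = sum(standard_localizations) / len(standard_localizations)
    return mean_count / known_copies


def estimate_copy_number(localizations, per_molecule, labeling_efficiency=1.0):
    """Convert a localization count into an estimated number of molecules.

    labeling_efficiency is the assumed fraction of target molecules that
    carry a detectable label (1.0 means perfect labeling).
    """
    if per_molecule <= 0 or not 0 < labeling_efficiency <= 1:
        raise ValueError("invalid calibration factor or labeling efficiency")
    return localizations / (per_molecule * labeling_efficiency)


if __name__ == "__main__":
    # Hypothetical standard: tetramers that gave 10, 12, and 14 localizations.
    factor = localizations_per_molecule([10, 12, 14], known_copies=4)
    print("localizations per molecule:", factor)
    print("estimated copies:", estimate_copy_number(30, factor, labeling_efficiency=0.5))
```

Treating blinking as one averaged factor is the crudest possible choice; it is useful mainly for showing why raw localization counts overestimate copy number when fluorophores blink and underestimate it when labeling is incomplete.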


Temporal dynamics of cellular structures 


Super-resolution imaging has enhanced our 
ability to extract dynamic information of cel- 
lular structures, allowing the mobility of bio- 
molecules and the shape or structural dynamics 
of molecular complexes and organelles to be 
tracked with higher accuracy. For example, STED 
imaging has been used in combination with 
fluorescence correlation spectroscopy (FCS) to 
study the diffusion properties of molecules on 
the membrane. The drastic reduction in the re- 
gion of fluorescence emission by the STED beam 
has allowed the detection of membrane nano- 
domains <20 nm in size, within which different 
lipid molecules show distinct diffusion proper- 
ties (68, 69) (Fig. 1C, i). Super-resolution imaging 
has also enhanced our ability to perform single- 
particle tracking (SPT) in live cells. Conventional 
SPT experiments require a low labeling density 
for the molecule of interest to avoid signal overlap 
between molecules. Stochastically turning on only 
a subset of labeled molecules at a given time, as in 
PALM, STORM, and PAINT, allows SPT at much 
higher molecular concentrations at the endoge- 
nous expression level (34, 44, 45), facilitating the 
studies of gene expression, protein-nucleic acid 
interaction, and dynamic processes on cell mem- 
branes. With its unique capabilities, MINFLUX 
has allowed the tracking of ribosomes in bac- 
terial cells with unprecedented spatiotemporal 
resolution, achieving a localization precision of 
<50 nm with a time resolution of ~100 μs (15) 
(Fig. 1C, ii). 

In addition, various super-resolution micros- 
copy methods have been used to measure struc- 
tural and shape dynamics of molecular assemblies, 
organelles, and small cellular compartments, 
such as the dynamics of neuronal processes and 
dendritic spines in tissue (70) and even in live 
animals (Fig. 1C, iii) (71), fission and fusion dynamics 
of mitochondria (34) (Fig. 1C, iv), and structural 
dynamics of ER (35). 


Super-resolution studies of specific 
molecular assemblies 

The membrane-associated periodic 
skeleton in neurons 


Super-resolution imaging enabled the discovery 
of the membrane-associated periodic skeleton 
(MPS) in neurons, which was initially observed 
in the axons by STORM imaging (47). In the MPS, 
short actin filaments, capped by actin-capping 
proteins, such as adducin, are organized into 
repetitive, ring-like structures that wrap around 
the circumference of the axon; adjacent actin 
rings are connected by spectrin tetramers, form- 
ing a long-range quasi-1D periodic structure with 
a periodicity of ~180 to 190 nm underneath the 
axonal membrane (47) (Fig. 2A). 
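
One common way to quantify a periodicity of this kind is to compute the autocorrelation of an intensity profile taken along the axon and read off the lag of the first peak away from zero. The sketch below applies this to a synthetic cosine profile; the 10-nm sampling step, the 190-nm test period, and the function names are assumptions chosen for illustration and do not reproduce the analysis in (47).

```python
import math


def autocorrelation(profile):
    """Normalized autocorrelation of a 1D profile for lags 0 .. n//2 - 1."""
    n = len(profile)
    mean = sum(profile) / n
    centered = [x - mean for x in profile]
    variance_sum = sum(c * c for c in centered)
    if variance_sum == 0:
        raise ValueError("profile is constant")
    acf = []
    for lag in range(n // 2):
        s = sum(centered[i] * centered[i + lag] for i in range(n - lag))
        acf.append(s / variance_sum)
    return acf


def first_peak_lag(acf):
    """Lag of the first positive local maximum after lag 0, or None."""
    for lag in range(1, len(acf) - 1):
        if acf[lag] > 0 and acf[lag] > acf[lag - 1] and acf[lag] >= acf[lag + 1]:
            return lag
    return None


def estimate_period_nm(profile, step_nm):
    """Estimate the spatial period from the first autocorrelation peak."""
    lag = first_peak_lag(autocorrelation(profile))
    return None if lag is None else lag * step_nm


if __name__ == "__main__":
    step_nm = 10.0      # assumed sampling interval along the axon
    period_nm = 190.0   # assumed period of the synthetic test profile
    samples = 228       # 12 full periods at 19 samples per period
    profile = [1.0 + math.cos(2 * math.pi * i * step_nm / period_nm)
               for i in range(samples)]
    print("estimated period (nm):", estimate_period_nm(profile, step_nm))
```

Real localization data are noisy and sparsely sampled, so in practice the profile would be built from a histogram of localizations projected onto the axon axis, and the peak position would typically be refined by fitting rather than taken from a single lag.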

The MPS spans the entire axon shaft, in both 
myelinated and unmyelinated axonal segments, 
including the axon initial segments (AIS) and 
nodes of Ranvier where action potentials are 
generated and amplified, respectively (47, 72-76) 
(Fig. 2, A and B). This structure was observed in 
all neuronal types examined, including excitatory 
and inhibitory neurons in both central and pe- 
ripheral nervous systems (74, 75), and is evolution- 
arily conserved across diverse animal species (75). 
Subsequent to its discovery in axons, this 1D 
periodic structure was also observed in dendrites 
by both STORM and STED (72, 73) (Fig. 2C), but 
the formation propensity and development rate 
of MPS appear to be lower in dendrites than in 
axons (77). In addition, a 2D polygonal lattice 
structure formed by MPS components was ob- 
served in the soma and dendrites (Fig. 2D), re- 
sembling the membrane skeleton structure 
observed in erythrocytes (77). 

This highly ordered submembrane skeletal 
structure can play diverse functional roles in 
neurons. It provides flexible mechanical support 
for axons that is likely critical for axon stability 
under mechanical stress (47); indeed, axons tend 
to break in spectrin-deficient animals under 
movement-induced stress (78). The MPS was 
also implicated in mechanosensation (79). More- 
over, the MPS organizes membrane proteins, such 
as ion channels and adhesion molecules, into 
periodic distributions along axons (47, 72, 73, 76), 
potentially influencing the generation and prop- 
agation of action potentials, and other signaling 
pathways in axons. The MPS also influences axon 
and dendrite morphology (80), is important for 
the formation of the AIS and nodes of Ranvier 
(72, 76, 80), and may also act as a diffusion barrier 
at the AIS (81). Disruption of the MPS causes 
widespread neurodegeneration and a range of 
neurological impairments (80), and mutations 
of MPS components are implicated in various 
neurodegenerative diseases. The discovery of 
the MPS, which escaped detection by previous 
imaging methods, demonstrates the power of 
super-resolution imaging for uncovering new 
cellular structures. 


Molecular organization in synapses 


Neuronal synapses are typically only several 
hundred nanometers in size but contain 
elaborate protein machineries to orchestrate 
neurotransmitter-mediated signal transmission; 
hence, the structural interrogation of synapses 
requires high spatial resolution and has bene- 
fited from extensive super-resolution imaging 
efforts. For example, STED has revealed the spa- 
tial organization of several important components 
within the Drosophila neuromuscular junction, 
including the clustered organization of Ca²⁺ channels, 
as well as the organization of scaffolding 
proteins required for both Ca²⁺ channel clustering 
and synaptic vesicle tethering at the presynaptic 
active zone (82, 83) (Fig. 3A). STORM imaging has 
mapped the spatial organization of many proteins 
in the pre- and postsynaptic terminals, which 
show oriented organization of presynaptic scaf- 
folding proteins, laminar organization of post- 
synaptic density proteins, and synapse-to-synapse 
variability in the lateral distributions of neuro- 
transmitter receptors (84) (Fig. 3B). 

In addition, recent super-resolution studies 
revealed that the neurotransmitter receptors and 
postsynaptic scaffolding proteins adopt activity- 
dependent clustered organization (85, 86) (Fig. 3C). 
Such clustered organization also extends 
across the synaptic cleft, giving rise to “nano- 
columns” formed by spatially aligned presynaptic 
vesicle fusion sites and postsynaptic receptor 
clusters (87) (Fig. 3D). This nanocolumn organiza- 
tion provides a mechanism for the coordination 
of synaptic vesicle release and neurotransmitter 
receptor response. 

Super-resolution studies of synapses have been 
recently extended to both proteomic-scale analy- 
sis of synaptic structures and circuit-scale analysis 
of synapse distributions. For example, STED has 
been used to image numerous protein compo- 
nents in the presynaptic terminals, creating a 
model of an “average” synaptic bouton (88). A 
volumetric STORM platform has been developed 
to determine the entire synaptic fields of 
neurons (38) (Fig. 3E), providing synaptic 
connectivity at the neural circuit scale. 

Fig. 3. Super-resolution imaging of synaptic structures. (A) Left: 3D STED 
images of the presynaptic active zone including Bruchpilot (Brp) and 
Drosophila RIM binding protein (DRBP), as well as the voltage-gated calcium 
channel Cacophony (Cac) at Drosophila neuromuscular junction synapses. 
Both axial (top) and radial (bottom) projections are shown. Right: Schematic of 
the active zone showing positions and orientations of components of the 
active zone cytomatrix including Brp, DRBP, and Cac in relation to the 
postsynaptic glutamate receptor (GluRIID) determined by using STED. 
Modified from (83). (B) Top: 3D STORM images of presynaptic protein 
Bassoon (red) and postsynaptic protein Homer (green). Two orthogonal axial 
views (left and middle) and the radial view (right) are shown. Center: Axial 
views of three synapses. In addition to Bassoon (red) and Homer1 (green), a 
third color (blue) was used to map the positions of additional postsynaptic 
(Shank, left; GluR1, right) and presynaptic (Piccolo, middle) components at 
synapses. Bottom: Radial views of three example synapses showing differential 
abundance and spatial distribution of neurotransmitter receptors, NR2B 
and GluR1. Modified from (84) with permission from Elsevier. (C) Radial 
projections of PALM images showing the clustered organization of 
postsynaptic proteins Shank3 (left) and Homer1c (right). Modified from (85) with 
permission from Elsevier. (D) STORM and PALM images show that areas of 
higher protein density (darker colors) of both presynaptic (RIM1/2, red) and 
postsynaptic (PSD-95, blue) components are often trans-synaptically aligned 
to form “nanocolumns” (indicated by filled arrows). Both axial (top) and radial 
(bottom) projections are shown. Modified from (87) with permission from 
Springer Nature. (E) STORM maximum intensity projection of a retinal ganglion 
cell (blue) with associated synapses marked by postsynaptic scaffolding 
protein gephyrin (green) and presynaptic proteins (Bassoon, Piccolo, 
Munc13-1, and ELKS) (magenta), reconstructed from ultrathin serial 
sections. Inset shows a magnified view of a region of dendrite. Modified 
from (38) with permission from Elsevier. 


Protein complexes with structural 
symmetry 


On the shorter length scale of individual protein 
complexes, it is possible to obtain higher-resolution 
reconstructions from many super-resolution images 
through particle averaging in a way that is similar 
to electron microscopy (EM) reconstruction, es- 
pecially for structures with well-defined symmetry. 
Two notable examples are centrioles and nuclear 
pore complexes (NPCs). 

STED and STORM, the latter in combination 
with particle averaging, have been used to visu- 
alize the ninefold symmetry of centrioles (89, 90) 
(Fig. 4A). In addition to resolving this symmetric 
arrangement, super-resolution imaging has also 
been used to map the 3D organization of several 
centriolar proteins and determine the order of 
protein recruitment during centriole formation 
(91, 92). At centrosomes, centrioles are surrounded 
by the less structured pericentriolar material 
(PCM), and the radial distribution of proteins 
within the centrosome and PCM has also been 
mapped by STORM and SIM (93, 94) (Fig. 4, B 
and C). 

Similarly, STORM imaging showed the eight- 
fold radial symmetry of NPCs (95) (Fig. 4D). In 
combination with particle averaging, STORM 
allowed the positions of seven nucleoporin com- 
ponents to be determined with ~1-nm precision, 
which in turn allowed the orientation of the 
Nup107-160 subcomplex within the pore to be 
determined (96). These super-resolution pictures 
allowed discrimination between contradictory 
models of the structural organization of the 
NPC scaffold (96) (Fig. 4E). 


Outlook 


Super-resolution fluorescence microscopy has 
transformed understanding of the structure and 
function of many biological systems. However, 
challenges remain, and further technological 
advances are needed to maximize the impact of 
super-resolution microscopy. 

Fig. 4. Super-resolution visualization of molecular complexes: Centriole-containing complexes and 
the nuclear pore complex. (A) Single STED image (left) and particle-averaged STORM image (right) of 
the centriolar protein (CEP164), showing a radial ninefold symmetry. STED image modified from (89) 
with permission from Elsevier; STORM image modified from (90) with permission from Springer Nature. 
(B) Three centriolar and pericentriolar proteins (Cent2, CEP152, and γ-tubulin [TUBG1]) imaged using 
SIM, showing a concentric organization of the pericentriolar matrix. Modified from (93) with permission 
from Springer Nature. (C) Average distribution of the C-terminus, central domain, and N-terminus of the 
pericentrin-like protein (PLP) demonstrating a radial, spoke-like orientation for PLP through the pericentriolar 
matrix as determined by SIM. Modified from (94) with permission from Springer Nature. (D) Single STORM 
image of the nucleoporin protein GP210 showing an eightfold symmetry within the nuclear pore complex 
of Xenopus oocytes. Modified from (95). (E) Left: Radial distribution of several nucleoporins, including 
Nup133, Nup107, Nup160 (C-terminus), Nup37, Nup160 (N-terminus), Seh1, and Nup85, that comprise 
the Y-shaped Nup107-160 complex determined using STORM and particle averaging. Right: A projection 
of the electron density of the cytoplasmic ring of the NPC determined by EM is overlaid with two possible 
arrangements of the Nup107-160 complex, determined by super-resolution imaging. Each protein is 
represented by a colored dot corresponding to the color and radius in the graph (left). Modified from (96). 

The spatial resolution achieved by super- 
resolution microscopy in biological systems 
typically ranges from 10 to 70 nm, larger than 
most biomolecules. Achieving true molecular- 
scale resolution (~1 nm) would allow molecular 
interactions and conformations to be directly 
probed inside cells, but remains a challenging 
task. In principle, the two main categories of 
optical approaches to overcome the diffraction 
limit, namely the patterned-illumination-based 
methods represented by STED, RESOLFT, and 
NL-SIM and the single-molecule-switching-based 
methods represented by STORM, PALM, and 
PAINT, can both achieve unlimited resolution. 
However, practical factors, such as the require- 
ment of increasing illumination intensity (in the 
former category) and increasing fluorophore pho- 
ton budget (in the latter category) for higher res- 
olution, limit the resolution that can be achieved. 
Reinspection of the fundamental principles of 
super-resolution methods can lead to powerful 
new innovations and concepts, as demonstrated 
by MINFLUX, which combines strengths from 
both approaches and achieves ultrahigh, pre- 
viously inaccessible, resolutions. In addition, these 
methods can be combined with ExM, an or- 
thogonal approach that achieves resolution 
increase through physical sample expansion, 
potentially leading to a direct multiplication 
in the fold increases in resolution that are 
separately achievable by individual methods. 
However, it is worth noting that the final image 
resolution is also limited by probe size and 
labeling density. Thus, to ultimately benefit from 
the ultrahigh resolution, parallel development in 
probes and labeling methods is needed to allow 
molecules in cells to be labeled with small- 
molecule probes with high efficiency. 

Furthermore, although super-resolution imag- 
ing has demonstrated subsecond and even milli- 
second time resolution in some cases, owing to 
the trade-off between spatial and temporal 
resolutions, the limited photon budget of the 
fluorophores, and phototoxicity to samples, 
live-cell imaging with high spatiotemporal res- 
olution for a long period of time remains difficult 
and an active area of development. In addition, 
in vivo super-resolution imaging deep inside tissues 
remains challenging, notwithstanding consider- 
able efforts combating tissue-induced background, 
aberration, and light scattering. 

Another challenge, but also an exciting new 
direction, is to increase the number of molecular 
species that can be simultaneously imaged. Cells 
contain thousands of distinct genes and other 
molecules that act collectively to give rise to 
behavior and function, yet multicolor imaging 
usually only allows simultaneous visualization 
of a few different molecular species. Recent ad- 
vances have broken new ground in this direction, 
and genomic-scale imaging is now within reach. 
For example, single-cell transcriptome-imaging 
methods have allowed RNAs of 1000 or more 
genes to be simultaneously imaged in individual 
cells by using multiplexed fluorescence in situ 
hybridization (FISH) (97, 98) or in situ sequencing 
(99, 100). A similar level of multiplexity 
may be achievable for DNA and proteins in the 
future. Combination of these approaches with 
super-resolution microscopy could potentially 
allow genomic-scale super-resolution imaging. 
Technologically, a major challenge in genomic- 
scale imaging is molecular crowding, which can 
prevent resolution of neighboring molecules by 
conventional imaging, and super-resolution micros- 
copy provides a promising solution. Biologically, 
the ability to image all molecules in a complex 
molecular machinery or in a whole signaling 
pathway, and ultimately at the whole-genome 
scale, will provide a comprehensive picture of the 
molecular basis of cellular behavior and func- 
tion. It is exhilarating to imagine how such a 
picture of a cell, with all molecules imaged at a 
resolution that allows direct inference of molec- 
ular interactions, would open new opportunities 
for understanding life at the molecular level. 
Arbovirus risk in Brazil 

Despite the existence of an effective vaccine for yellow fever, 
there are still almost 80,000 fatalities from this infection each 
year. Since 2016, there has been a resurgence of cases in Africa and 
South America, and this at a time when the vaccine is in short supply. 
The worry is that yellow fever will spread from the forests to the cities, 
because its vector, Aedes spp. mosquitoes, are globally ubiquitous. Faria et 
al. integrate genomic, epidemiological, and case distribution data from Brazil 
to estimate patterns of geographic spread, the risks of virus exposure, and 
the contributions of rural versus urban transmission (see the Perspective by 
Barrett). Currently, the yellow fever epidemic in Brazil seems to be driven 
by infections acquired while visiting forested areas and indicates spillover 
from susceptible wild primates. —CA 
Science, this issue p. 894; see also p. 847 

Yellow fever is moving from the forests to the cities. 


Seismic limits for hard and soft rock 

Induced earthquakes from oil, gas, and geothermal energy 
exploration projects can damage infrastructure and concern 
the public. However, it remains unclear how far away from an 
injection site an earthquake can still be triggered. Goebel 
and Brodsky looked at 18 different earthquake-producing 
injection sites around the world to address this issue. 
Injecting fluid into softer layers increased the range for 
seismic hazard, whereas harder basement rock better confined 
the fluid. These findings should be considered when regulating 
and managing projects with the potential to induce seismicity. 
—BG 
Science, this issue p. 899 


Perovskite/CIGS tandem cells 

Tandem solar cells can boost efficiency by using more of the 
available solar spectrum. Han et al. fabricated a two-terminal 
tandem cell with an inorganic-organic hybrid perovskite top 
layer and a Cu(In,Ga)Se₂ (CIGS) bottom layer. Control of the 
roughness of the CIGS surface and the use of a heavily doped 
organic hole transport layer were crucial to achieve a 22.4% 
power conversion efficiency. The unencapsulated tandem 
cells maintained almost 90% of their efficiency after 500 hours 
of operation under ambient conditions. —PDS 
Science, this issue p. 904 


The structure of the genome 

Beyond the sequence of the genome, its three-dimensional 
structure is important in regulating gene expression. 
To understand cell-to-cell variation, the structure 
needs to be understood at a single-cell level. Chromatin 
conformation capture methods have allowed 
characterization of genome structure in haploid cells. 
Now, Tan et al. report a method called Dip-C that 
allows them to reconstruct the genome structures of 
single diploid human cells. Their examination of 
different cell types highlights the tissue dependence of 
three-dimensional genome structures. —VV 
Science, this issue p. 924 

Published by AAAS 

ECOLOGY
Fisheries management 
and human adaptation 


Finding effective ways to 
mitigate the future impacts of 
climate change on fisheries is 
critical. Previous efforts have 
not incorporated alternative 
human responses to climate 
change. These could limit or 
exacerbate ecosystem changes 
that affect fish stocks and 
population locations. Gaines 
et al. analyzed four fisheries 
management approaches that 
address fish stock-productivity
adaptation and/or range-shift 
adaptation. They then applied 
these management scenarios to 
915 species stocks worldwide. 
Implementing proactive and 
adaptive fishery management 
approaches would bring about 
higher global profits (154%), 
harvest (34%), and biomass 
(60%) as compared with strate- 
gies reflecting no management 
changes. Addressing both range 
shifts and productivity changes 
leads to greater benefits as
compared with targeting one 
challenge alone. —PJB 

Sci. Adv. 10.1126/sciadv.aao1378 (2018).


PAIN 
A dual-targeting painkiller


Opioids are among the most 
effective treatments for severe 
pain. Their pain-relieving effects 
are mediated by activation 
of the mu opioid receptor 
(MOR). Unfortunately, selec- 
tive MOR agonists induce 
diverse side effects, includ- 
ing respiratory depression, 
tolerance, hyperalgesia, and 
dependence. Recently, activa- 
tion of the nociceptin/orphanin 
FQ peptide receptor (NOR) 
has been reported to enhance 
MOR agonist-induced anal-
gesia without producing side 
effects. Ding et al. developed a 
bifunctional MOR/NOR agonist 
called AT-121, which showed 
potent analgesic effects in 
nonhuman primates without 
inducing hyperalgesia, respira- 
tory depression, or dependence. 
Bifunctional MOR/NOR 
agonists might thus represent 
a safe and effective pharmaco-
logical tool for treating severe 
pain. —MM 

Sci. Transl. Med. 10, eaar3483 (2018). 


MOLECULAR MOTORS 
Tiny cargos ferried along 
a track 


Control of molecules at the 
nanometer scale requires 
motors that convert potential 
energy into movement. Qing et 
al. designed a small molecule 
that could hop along a track 

of cysteine residues within a 
membrane-embedded protein 
pore. The direction of proces- 
sive movement along the 
track was reversible, driven 

by an applied potential across 
the membrane. Cargos were 
attached to a carrier motor, 
and their position and chemical 
identity read out from changes 
in the current through the 
pore. These features enabled 
repeat observations of a single 
molecule as it moved back and 
forth on the track. -MAF 


Science, this issue p. 908 


CLIMATE CHANGE 
Warming, crops, and 
insect pests 


Crop responses to climate 
warming suggest that yields 
will decrease as growing- 
season temperatures increase. 
Deutsch et al. show that this 
effect may be exacerbated 
by insect pests (see the 
Perspective by Riegler). Insects 
already consume 5 to 20% of 
major grain crops. The authors’ 
models show that for the three 
most important grain crops— 
wheat, rice, and maize—yield 
lost to insects will increase by 
10 to 25% per degree Celsius of 
warming, hitting hardest in the 
temperate zone. These findings 
provide an estimate of further 
potential climate impacts 
on global food supply and a
benchmark for future regional 
and field-specific studies of 
crop-pest-climate interactions. 
—AMS 

Science, this issue p. 916; 

see also p. 846 
Yellowstone lake sediments
reveals vegetation shifts 
over 18,000 years. 


PALEOECOLOGY 


Edited by Caroline Ash 
and Jesse Smith 


Climate change in a mountain ecosystem 


Fossil pollen in lake sediments provides valuable records
of past vegetation patterns and offers a baseline for 
assessing how vegetation responds to climate change. 
To assess how the vegetation composition and distri- 
bution of a mountain system has varied with climate 
change over 18,000 years, Iglesias et al. studied pollen 
sequences from lakes in the Greater Yellowstone Ecosystem 
of the United States. They found complex patterns, with 
long-term stability in some plant communities and rapid 
change in others. The present-day mixed conifer forest 
cover, known to be vulnerable to climate warming, is now 


more compressed in its elevation range than in previous


postglacial millennia. These data provide a context for 
assessing future responses to climate change. —AMS 


J. Biogeogr. 45, 1768 (2018). 


NEUROSCIENCE 
Degrees of stress in 
neurodegeneration 


In the neurodegenerative 
disorder amyotrophic lateral 
sclerosis (ALS), the nuclear pro- 
tein called transactive response 
DNA binding protein of 43 kDa 
(TDP-43) accumulates in stress 
granules within the cytoplasm 
of neurons and glia and is linked
to disease pathology. McGurk 
et al. report that TDP-43 binds 
to poly(ADP-ribose) (PAR), 
which triggers phase separation 
of TDP-43 and its subsequent 
recruitment to stress gran- 
ules. Under short-term stress, 
phosphorylated TDP-43, which 
is considered a hallmark of dis- 
ease, is unexpectedly excluded 
EDUCATION


Disruptive classmates, 
long-term harm 


Children who are behaviorally disrup-
tive during primary school can have 
harmful impacts on their classmates 
into adulthood. Carrell et al. use 
data from Florida, USA, to show 
that a child who experiences domestic 
violence at home (a well-recognized proxy 
for that child demonstrating disruptive 
behavior such as bullying) can lower their 
classmates’ secondary-school math and 
reading test scores, lower their likelihood 
of enrolling in college, and reduce earn- 
ings in their mid-20s by 3%. Differential 
exposure to such classmates accounts for 
roughly 5% of the rich-poor earnings gap 
in adulthood. -BW 
Amer. Econ. Rev. www.aeaweb.org/articles?id=10.1257/aer.20160763 (2018).


Exposure to disruptive 
behavior, like bullying, 
during primary school can 
have lifelong effects. 


from stress granules. This find- 
ing indicates that the granules 
initially prevent phosphorylated 
TDP-43 aggregation unless 
stress is prolonged. This work 
also points to an approach to 
ALS treatment by inhibition 
of PAR polymerase (PARP) to 
reduce PAR production. For 
instance, a small-molecule 
inhibitor of PARP that prevents 
cancer-cell proliferation also 
blocks cytoplasmic TDP-43 
aggregation. —LC 
Mol. Cell 10.1016/j.molcel.2018.07.002 
(2018). 


REPROGRAMMING 
FACTs behind control 
of cell fate 


As animals develop, their cells 
become progressively less 
plastic and follow defined 
functional destinies. Kolundzic 
et al. used a genetic screen 

of the worm Caenorhabditis 
elegans to uncover proteins that 
prevent cells from straying from 
their intended fate. They found 
that the histone chaperone 
FACT plays a regulatory part in 
an unexpected way: It is non- 
repressive and also promotes 
gene expression. FACT acts as 

a barrier to cell reprogramming 
by stabilizing gene expression 
and thereby safeguarding cell
identity. A germline-specific 
isoform of FACT ensures that 
cells with intestinal and germline 
programming confirm their fate 
and do not adopt a neuronal role. 
Furthermore, depletion of FACT 
in human fibroblasts enhances 
production of induced pluripo- 
tent stem cells, indicating that a 
conserved mechanism is at work 
to channel cell fate in animals. 
—BAP 

Dev. Cell 10.1016/j.devcel.2018.07.006 (2018).


FRAMEWORK MATERIALS 
Transversal zigzag linkers 


The linkers for metal-organic 
frameworks are usually biden- 
tate molecules (for example, 
dicarboxylic acids) connected 
by an organic group to create 
a linear or, in some cases, a 
bent geometry like isophthalic 
acid. Guillerm et al. explored 
the effect of a “zigzag” linker, 
trans,trans-muconic acid 
(tmuc), that forces an offset 
of inorganic building blocks. 
Reaction with ZrCl₄ formed
the metal-organic framework 
Zr,0,(OH),(tmuc),(H,0)., 
which had an eight-connected 
bcu topology, a subset of the 
12-connected fcu topology seen 
with linear linkers. This bcu
topology was maintained with
linkers of even larger transversal 
width, such as azobenzene-3,3'- 
dicarboxylic acid. —PDS 

J. Am. Chem. Soc. 140, 10153 (2018).


MICROBIOLOGY 
Impermanent permafrost 


Permafrost constitutes a 
quarter of Earth's surface and 
about half the buried ancient 
carbon. Thaw releases water, 
and, together with higher 
temperatures, this promotes 
microbial respiration. Thus, 
permafrost melt during global 
warming represents a threat 
for escalating greenhouse gas 
release. Muller et al. extracted 
2-meter core samples from 
Svalbard permafrost in Norway 
for 16S ribosomal RNA gene 
analysis. Sampling at 3-cen- 
timeter intervals, they noted 
distinctive strata of microbial 
communities. On thawing and 
subsequent incubation, each 
community showed different 
metabolic rates and different 
CO₂ fluxes. Within 24 hours,
thawing the deepest permafrost
layers released most CO₂, but
over a longer term, most CO₂
was produced under shal- 

low aerobic conditions. These 
Svalbard mineral soils also have 
high iron availability. Intimate 




knowledge of the microbial, as 
well as the physicochemical, 
conditions prevailing in any spe- 
cific permafrost area is needed 
to accurately estimate CO₂
emission during anthropogenic 
climate warming. —CA 

Environ. Microbiol. 10.1111/1462-2920.14348 (2018).


QUANTUM COMPUTATION 
Trapped ions tackle 
chemistry 


Some of the most likely first 
applications of future quantum 
computers may be in quantum 
chemistry. Even with currently 
available quantum computers 
consisting of just a few qubits, 
it is possible to address certain 
simple problems, but most of 
the development has occurred in 
systems using superconducting 
qubits. Hempel et al. used up to 
four qubits encoded by trapped 
ions to calculate the ground- 
state energies of two simple 
molecules, H₂ and LiH. They
made use of a hybrid classical- 
quantum method called the 
variational quantum eigensolver, 
which relegates parts of the com- 
putation such as preprocessing 
and optimization to a classical 
computer. —JS 

Phys. Rev. X 8, 031022 (2018). 
CANCER
Looping together genes 
in cancer 


A subset of human cancers 
are characterized by aberrant 
fusion of two specific genes. In 
some cases, the activity of the 
resultant fusion protein drives 
tumor growth. Most fusion genes 
in cancer appear to arise from 
simple reciprocal chromosomal 
translocations. Anderson et al. 
found that the characteristic 
fusion gene in a bone and soft 
tissue tumor called Ewing sar- 
coma is produced by a far more 
complicated mechanism (see 
the Perspective by Imielinski 
and Ladanyi). In nearly half of 
the tumors examined, the fusion 
gene was created by the forma- 
tion of dramatic genomic loops 
that disrupt multiple genes. 
These complex rearrangements 
occur in early replicating and 
transcriptionally active regions 
of the genome and are associ- 
ated with poor prognosis. —PAK 
Science, this issue p. 891; 
see also p. 848 


SIGNAL TRANSDUCTION 
Dynamics of cell signaling 
and decoding 


Defects in cellular signaling 
pathways, like those in some 
cancer cells, are often thought to 
result in increased or decreased 
steady-state signals that 
promote or inhibit cell prolifera- 
tion. But Bugaj et al. show that 
dynamic changes in the duration 
or frequency of a signal can also 
alter cellular responses (see 

the Perspective by Kolch and 
Kiel). They took precise control 
of signaling in cultured human 
or mouse cells with a light-con- 
trolled mechanism for activating 
and inactivating the guanosine 
triphosphatase Ras. Known cancer
mutations in components of the 
Ras-activated signaling path- 
way or inhibitors of particular 
pathway components altered 
signal timing and readouts. The 
modified dynamics changed 
transcriptional outcomes and
could inappropriately support
cell proliferation. The ability
to probe responses of signaling
networks in this way may
enhance understanding of biological
regulation and reveal new
therapeutic targets. —LBR
Science, this issue p. 892;
see also p. 844


BIOTECHNOLOGY 
Lineage tracing in mouse 
using CRISPR 


A homing guide RNA (hgRNA) 
that directs CRISPR-Cas9 to 

its own DNA locus can diver- 
sify its sequence and act as 

an expressed genetic barcode. 
Kalhor et al. engineered a mouse 
line carrying 60 independent 
loci of hgRNAs, thus generating 
a large number of unique bar- 
codes in various embryonic and 
extraembryonic tissues in fully 
developed mice. This method 
demonstrates lineage tracing 
from the very first branches 

of the development tree up to 
organogenesis events and was 
used to elucidate embryonic 
brain patterning. —SYM 


Science, this issue p. 893 


ORGANOMETALLICS 
Carbonyls in the s block 


Conventional wisdom in chem- 
istry distinguishes transition 
metals from other elements by 
their use of d orbitals in bonding. 
Wu et al. now report that alkaline 
earth metals can slide their 
electrons from s- to d-orbital 
bonding motifs as well (see the 
Perspective by Armentrout). 
Calcium, strontium, and barium 
all form coordination complexes 
with a cubic arrangement of 
eight carbonyl ligands and an 
18-electron valence shell. The 
compounds were character- 
ized in frozen neon matrices by 
vibrational spectroscopy and in 
gas phase by mass spectrometry. 
—JSY 


Science, this issue p. 912; 
see also p. 849 
CLIMATE CHANGE
Future predictions from 
paleoecology 


Terrestrial ecosystems will 
be transformed by current 
anthropogenic change, but the 
extent of this change remains a 
challenge to predict. Nolan et al. 
looked at documented vegeta- 
tional and climatic changes at 
almost 600 sites worldwide 
since the last glacial maximum 
21,000 years ago. From this, 
they determined vegetation 
responses to temperature 
changes of 4° to 7°C. They went 
on to estimate the extent of 
ecosystem changes under cur- 
rent similar (albeit more rapid) 
scenarios of warming. Without 
substantial mitigation efforts, 
terrestrial ecosystems are at risk 
of major transformation in com- 
position and structure. —AMS 
Science, this issue p. 920 


EVOLUTION 
Venoms yield their 
secrets 


Venoms can be deadly but, in 
the right hands, can perform 
important therapeutic functions. 
Research into potential drug 
leads from venoms has, however, 
been hampered by an inability 
to study venoms from animals 
that are small, rare, or difficult to 
maintain in the laboratory. In a
Perspective, Holford et al. high- 
light recent studies that have 
used genomics and other omics 
technologies to study venoms 
from a wide range of organisms, 
shedding light on the evolution- 
ary biology of venoms. The 
work is also providing important 
leads for the development of 
therapeutics and eco-friendly 
insecticides. —JFU 

Science, this issue p. 842 


CELL DEATH 


Casting NETs 

Gasdermin D (GSDMD), a pore- 
forming protein, has emerged 
as a key downstream effector 




in pyroptosis, a form of cell 
death induced by intracellular 
lipopolysaccharide. Sollberger 
et al. found that GSDMD was 
activated in neutrophils during 
the generation of neutrophil 
extracellular traps (NETs). NETs 
are composed of chromatin and 
antimicrobial proteins and are 
cast by dying neutrophils in a
process termed NETosis. While
carrying out a chemical screen 
to identify molecules that block 
NETosis, the authors identified 
a pyrazolo-oxazepine scaffold-based
molecule that binds
GSDMD. Chen et al. also report a 
role for GSDMD in NETosis, and 
Rathkey et al. report necrosul- 
fonamide to be an inhibitor of 
GSDMD. —AB 


Sci. Immunol. 3, eaar6689, eaar6676, 
eaat2738 (2018). 


GPCR SIGNALING 
Melatonin meets diabetes 


Some of the single-nucleotide 
polymorphisms associated with 
type 2 diabetes occur in the 
gene that encodes the melatonin 
receptor MT₂, a G protein-coupled
receptor (GPCR). Karamitri
et al. measured the spontaneous
and melatonin-stimulated signaling
of 40 different MT₂ variants.
MT₂ variants with defective
melatonin-stimulated G protein
signaling and reduced spontaneous
β-arrestin recruitment were
associated with the greatest risk
for type 2 diabetes. These data
may aid in developing specific
type 2 diabetes treatments
based on a patient's MT₂ variant.
—JFF 

Sci. Signal. 11, eaan6622 (2018). 
Rearrangement bursts generate canonical gene fusions in bone and soft tissue tumors


Nathaniel D. Anderson, Richard de Borja*, Matthew D. Young*, Fabio Fuligni*,
Andrej Rosic, Nicola D. Roberts, Simon Hajjar, Mehdi Layeghifard, Ana Novokmet,
Paul E. Kowalski, Matthew Anaka, Scott Davidson, Mehdi Zarrei, Badr Id Said,
L. Christine Schreiner, Remi Marchand, Joseph Sitter, Nalan Gokgoz, Ledia Brunga,
Garrett T. Graham, Anthony Fullam, Nischalan Pillay, Jeffrey A. Toretsky,
Akihiko Yoshida, Tatsuhiro Shibata, Markus Metzler, Gino R. Somers,
Stephen W. Scherer, Adrienne M. Flanagan, Peter J. Campbell, Joshua D. Schiffman,
Mary Shago, Ludmil B. Alexandrov, Jay S. Wunder, Irene L. Andrulis, David Malkin†,
Sam Behjati†, Adam Shlien†


INTRODUCTION: Gene fusions are often 
disease-defining events in cancer. The muta- 
tional processes that give rise to fusions, their 
timing relative to initial diagnosis, and wheth- 
er they change at relapse are largely unknown. 
Mutational processes leave distinct marks in 
the tumor genome, meaning that DNA sequenc- 
ing can be used to reconstruct how fusions 
are generated. A prototypical fusion-driven 
tumor is Ewing sarcoma (ES), a bone cancer
predominantly affecting children and young 
adults. ES is defined by fusions involving EWSR1, 
a gene encoding an RNA binding protein, and 
genes encoding E26 transformation-specific 
(ETS) transcription factors such as FLI1. We
sought to reconstruct the genomic events that
give rise to EWSR1-ETS fusions in ES and chart
their evolution from diagnosis to relapse. 


[Figure: schematic timeline of mutations in a patient with ES. Panel labels: EWSR1-FLI1 fusion generated and other genic disruptions; clinically undetectable tumor; detected relapse or metastasis.]


Timing of mutations in a patient with ES. The schematic shows genetic alterations in tumors at 
prediagnosis, diagnosis, and relapse. In many cases, the fusion gene that drives tumorigenesis 
(EWSR1-FLI1 or EWSR1-ERG) emerges via a sudden burst of genomic rearrangements involving
multiple chromosomes and genes. This event, called chromoplexy (indicated by the starburst), 
happens early in the evolution of the disease in a prediagnostic lesion. After this event, the diagnostic 
and relapsed tumors evolve in parallel. In this model, the clone that would ultimately become the 
relapsed tumor was already present at the time of initial diagnosis, although it was undetectable. 


Anderson et al., Science 361, 891 (2018) 
RATIONALE: We studied the processes under-
pinning gene fusions in ES using the whole- 
genome sequences of 124 primary tumors. We 
determined the timing of the emergence of 
EWSR1 fusions relative to other mutations.
To measure ongoing mutation rates and evo- 
lutionary trajectories of ES, we studied the ge- 
nomes of primary tumors, tumors at relapse, 
and metastatic tumors. 


RESULTS: We found that EWSR1-ETS, the key
ES fusion, arises in 42% of cases via complex, 
loop-like rearrangements called chromoplexy, 
rather than by simple reciprocal translocations. 
Similar loops forming canonical fusions were 
found in three other sarcoma types. Timing 

the emergence of loops revealed that they occur
as bursts in early replicating DNA, as a primary
event in ES development. Additional gene disruptions
are generated concurrently with the fusions within the loops.
Chromoplexy-generated EWSR1 fusions appear
to be associated with an aggressive form of the 
disease and a higher chance of relapse. Numerous 
mutations present in every cell of the primary 
were absent at relapse, demonstrating that the 
primary and relapsed diseases evolved indepen- 
dently. This divergence occurs after formation 
of an ancestral clone harboring EWSR1 fusions.
Importantly, we determined that divergence of 
the primary tumor and the future relapsed 
tumor occurs 1 to 2 years before initial diagnosis, 
as estimated from the number of cell division- 
associated mutations. 


CONCLUSION: Our findings provide insights 
into the pathogenesis and natural history of 
human sarcomas. They reveal complex DNA 
rearrangements to be a mutational process 
underpinning gene fusions in a large pro- 
portion of ES. Similar observations in other 
fusion-defined sarcoma types indicate that 
this process operates more generally. Such 
complex rearrangements occur preferentially 
in early replicating and transcriptionally active 
genomic regions, as evidenced by the addi- 
tional genes disrupted. EWSR1 fusions arising
from chromoplexy correlated with worse clin-
ical outcomes. Formation of the EWSR1 fusion
genes is a primary event in the life history of 
ES. We found evidence of a latency period be- 
tween this seeding event and diagnosis. This 
is in keeping with the often-indolent nature of 
symptoms before clinical disease presentation. 


The list of author affiliations is available in the full article online. 
*These authors contributed equally to this work. 
†Corresponding author. Email: adam.shlien@sickkids.ca (A.S.);
sb31@sanger.ac.uk (S.B.); david.malkin@sickkids.ca (D.M.) 
Cite this article as N. D. Anderson et al., Science 361, 
eaam8419 (2018). DOI: 10.1126/science.aam8419 
Rearrangement bursts generate canonical gene fusions in bone and soft tissue tumors


Nathaniel D. Anderson, Richard de Borja*, Matthew D. Young*, Fabio Fuligni*,
Andrej Rosic, Nicola D. Roberts, Simon Hajjar, Mehdi Layeghifard,
Ana Novokmet, Paul E. Kowalski, Matthew Anaka, Scott Davidson, Mehdi Zarrei,
Badr Id Said, L. Christine Schreiner, Remi Marchand, Joseph Sitter, Nalan Gokgoz,
Ledia Brunga, Garrett T. Graham, Anthony Fullam, Nischalan Pillay,
Jeffrey A. Toretsky, Akihiko Yoshida, Tatsuhiro Shibata, Markus Metzler,
Gino R. Somers, Stephen W. Scherer, Adrienne M. Flanagan,
Peter J. Campbell, Joshua D. Schiffman, Mary Shago, Ludmil B. Alexandrov,
Jay S. Wunder, Irene L. Andrulis, David Malkin†,
Sam Behjati†, Adam Shlien†


Sarcomas are cancers of the bone and soft tissue often defined by gene fusions. Ewing 
sarcoma involves fusions between EWSR1, a gene encoding an RNA binding protein, and
E26 transformation-specific (ETS) transcription factors. We explored how and when
EWSR1-ETS fusions arise by studying the whole genomes of Ewing sarcomas. In 52 of 124
(42%) tumors, the fusion gene arises by a sudden burst of complex, loop-like
rearrangements, a process called chromoplexy, rather than by simple reciprocal 
translocations. These loops always contained the disease-defining fusion at the center, but 
they disrupted multiple additional genes. The loops occurred preferentially in early replicating 
and transcriptionally active genomic regions. Similar loops forming canonical fusions were 
found in three other sarcoma types. Chromoplexy-generated fusions appear to be associated 
with an aggressive form of Ewing sarcoma. These loops arise early, giving rise to both primary 
and relapse Ewing sarcoma tumors, which can continue to evolve in parallel. 


Genomic rearrangements (structural variants)
are a ubiquitous source of somatic
mutation in human cancer. They arise
from breaks in chromosomes, which are
then aberrantly rejoined. Rearrangements
may occur in isolation or in the context of com- 
plex genomic catastrophes that shatter chromo- 
somes [chromothripsis (1)] or join chromosomes
in chains or loop structures [chromoplexy (2)]. 
Rearrangements can generate cancer-driving mu- 
tations through several mechanisms, including 
the formation of gene fusions. Typically, fusions 
are fashioned by translocations that are often 
reciprocal. An exception is the prostate cancer 
fusion gene, TMPRSS2-ERG, which can occur 
through chromoplexy (2). 

Oncogenic gene fusions are particularly com- 
mon in leukemia and bone and soft tissue tumors 
(3), often acting as the sole driver mutation and 
delineating clinically relevant tumor entities and 
subgroups. In leukemia, recombination-activating 
gene (RAG)-mediated structural variation has 
been identified as the leading mutational pro- 
cess that creates canonical gene fusions and 
drives oncogenesis through translocations and 
deletions (4). Here we sought to investigate pro- 
cesses and timing of oncogenic fusions in hu- 
man bone and soft tissue tumors. 


Anderson et al., Science 361, eaam8419 (2018) 


The starting point of our investigation was 
Ewing sarcoma (ES), a bone and soft tissue 
cancer predominantly diagnosed in adolescents 
and young adults. It represents the prototypical 
fusion-driven sarcoma, defined by fusions be- 
tween EWSR1, a gene encoding an RNA binding
protein, and genes encoding E26 transformation-
specific (ETS) transcription factors, including
FLI1 and ERG (5). Although the downstream
consequences of EWSR1-ETS fusion genes are
well established (6), the timing and mechanism 
by which they arise are unknown. 


Burden and signatures of substitution 
mutations in ES 


We sequenced either the gene-containing por- 
tions or the whole genomes of 50 ES tumors 
and their matched normal DNA (complete 
sequencing details in table S1). We used a 
conventional analysis pipeline to call somatic 
substitutions and rearrangements, with addi- 
tional custom software to remove recurrent 
artifacts and sources of false positives (see 
methods and fig. S1). Overall, and consistent
with previous reports (7-10), the ES genome 
is genetically quiet, with few somatic substitu- 
tions identified (median of <1 mutation/Mb; 
Fig. 1A). 
We next asked if the collection of all mu-
tations, when considered together, could help 
highlight consistent mutagenic processes in 
ES. We extracted mutational signatures using 
an established method that allows for the dis- 
covery of new signatures. Despite the young age 
of patients with ES, and overall low number of 
mutations, we found that the tumors contained 
at least seven distinct signatures, all of which 
matched patterns seen in adult cancers [Cata- 
logue of Somatic Mutations in Cancer (COSMIC) 
signatures 1, 2, 5, 8, 13, 18, and 31; Fig. 1B and 
fig. S2A] (11, 12). Two of these (signatures 1 and
5) were nearly universal and associated with 
patient age. Signature 1 generated a steady rate 
of seven mutations per gigabase per year, which 
is similar to that of adult ovarian and breast 
cancers (fig. S2B) (13). An overview of the so- 
matic architecture and mutational signatures of 
each tumor in our discovery cohort is shown in 
Fig. 1, A to C (left panels, Toronto cohort). 
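As a rough illustration of the arithmetic behind a clock-like signature, the sketch below converts the reported rate of seven signature 1 mutations per gigabase per year into an expected mutation count at a given age, and inverts the same relation. The genome size of 3.1 Gb, the example ages, and the function names are assumptions made for illustration; they are not values or methods taken from the study.

```python
# Illustrative arithmetic only. The genome size and ages are assumed values.
RATE_PER_GB_PER_YEAR = 7.0  # signature 1 rate reported in the text
GENOME_SIZE_GB = 3.1        # assumption: approximate haploid human genome size


def expected_clock_mutations(age_years, rate=RATE_PER_GB_PER_YEAR,
                             genome_gb=GENOME_SIZE_GB):
    """Expected number of clock-like mutations accumulated by a given age."""
    if age_years < 0:
        raise ValueError("age_years must be non-negative")
    return rate * genome_gb * age_years


def years_from_mutations(n_mutations, rate=RATE_PER_GB_PER_YEAR,
                         genome_gb=GENOME_SIZE_GB):
    """Invert the clock: elapsed years implied by a mutation count."""
    return n_mutations / (rate * genome_gb)


if __name__ == "__main__":
    for age in (5, 15, 25):  # hypothetical patient ages
        print(age, round(expected_clock_mutations(age), 1))
```

The same linear relation is what allows a count of cell division-associated mutations to be read as elapsed time, provided the rate is roughly constant.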


Chromoplexy rearrangement loops are 
common in aggressive ES 


Having observed few small mutations, we then 
focused our attention on structural rearrangements. 
We applied a bespoke analysis tool to detect 
clustered rearrangements from whole-genome 
data, defined as having an inter-rearrangement 
distance of <10 kilobase pairs (kbp) (see meth-
ods). Using a computational data structure that 
modeled adjacent breakpoints as vertices and 
interconnected rearrangements as edges in a 
graph, we uncovered several distinct configu- 
rations of rearrangement clusters (Fig. 1D). As 
expected, one configuration of rearrangement 
clusters was a result of reciprocal rearrange- 
ments, where there is an equal exchange of 
genetic material and overlapping breakpoints. 
These were isolated rearrangements that oc- 
curred without additional breakpoints nearby. 
A second configuration, seen in 14/24 tumor 
genomes, was a distinctive pattern of focal clus- 
tered events with nearly overlapping junctions, 
organized as closed loops (distance of <30 bp; 
Fig. 1D, red distribution). That is, if one follows 
these complex rearrangements across their mul- 
tiple constituent chromosomes, one is ultimately 
brought back to the point of departure. Impor- 
tantly, the loops were nearly always centered on 
EWSR1-ETS (Fig. 1E). These abutting rearrange-
ments that occur in a loop resemble a pattern of
chromoplexy, akin to the loops of the prostate
cancer fusion gene, TMPRSS2-ERG. Of note, the
EWSR1-ERG fusion was always generated by a
complex mechanism, whereas EWSR1-FLI1 arose
with or without this mechanism (fig. S3A). This
is likely due to the opposite gene orientation
of EWSR1 relative to ERG on their respective
chromosome arms. A simple two-chromosome 
break rearrangement cannot place the genes in 
the correct transcriptional orientation, necessita- 
ting more complex chromosomal rearrangements 
for fusion formation. Other than this difference, 
chromoplexy in ERG- and FLI1-driven tumors was
very similar (fig. S3B). 
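The graph representation described above can be sketched in a few lines. The following is a minimal illustration, not the authors' bespoke tool: it groups breakpoints lying within 30 bp of one another on the same chromosome into vertices, treats each rearrangement as an edge, and labels a connected component as a closed loop when every vertex has exactly two incident edges. The function names, the single-linkage grouping, the omission of strand orientation, and the example coordinates are all assumptions made for illustration.

```python
from collections import defaultdict


def cluster_breakpoints(rearrangements, max_gap=30):
    """Map each rearrangement end to a vertex id.

    Breakpoints on the same chromosome closer than max_gap bp to their
    nearest sorted neighbor share a vertex (single-linkage grouping).
    """
    by_chrom = defaultdict(list)
    for i, ends in enumerate(rearrangements):
        for end, (chrom, pos) in enumerate(ends):
            by_chrom[chrom].append((pos, i, end))
    vertex_of, next_id = {}, 0
    for chrom in sorted(by_chrom):
        prev_pos = None
        for pos, i, end in sorted(by_chrom[chrom]):
            if prev_pos is None or pos - prev_pos >= max_gap:
                next_id += 1  # gap too large: start a new vertex
            vertex_of[(i, end)] = next_id
            prev_pos = pos
    return vertex_of


def classify_components(rearrangements, max_gap=30):
    """Label each connected component of the rearrangement graph."""
    vertex_of = cluster_breakpoints(rearrangements, max_gap)
    adjacency = defaultdict(list)
    for i in range(len(rearrangements)):
        u, v = vertex_of[(i, 0)], vertex_of[(i, 1)]
        adjacency[u].append(v)
        adjacency[v].append(u)
    seen, results = set(), []
    for start in sorted(adjacency):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:  # depth-first traversal of one component
            node = stack.pop()
            if node not in component:
                component.add(node)
                stack.extend(adjacency[node])
        seen |= component
        closed = all(len(adjacency[n]) == 2 for n in component)
        if closed and len(component) == 2:
            label = "reciprocal"
        elif closed and len(component) >= 3:
            label = "closed loop"
        else:
            label = "other"
        results.append((label, len(component)))
    return results


# Hypothetical coordinates: a three-chromosome loop, a reciprocal
# exchange, and an isolated rearrangement.
EXAMPLE = [
    (("chr22", 29_000_000), ("chr11", 128_000_000)),
    (("chr11", 128_000_010), ("chr5", 50_000_000)),
    (("chr5", 50_000_020), ("chr22", 29_000_015)),
    (("chr1", 1_000_000), ("chr2", 2_000_000)),
    (("chr2", 2_000_005), ("chr1", 1_000_008)),
    (("chr7", 10_000_000), ("chr9", 20_000_000)),
]

if __name__ == "__main__":
    print(classify_components(EXAMPLE))
```

Requiring degree two at every vertex encodes the property stated in the text: following the rearrangements from any breakpoint eventually returns to the point of departure.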

In all cases, we resolved the breakpoints and 
found positions largely consistent with type I 
or type II ES (14). In the most complex case of
chromoplexy, up to 18 genes were brought to- 
gether with the canonical fusion on the same 
derivative chromosome (fig. S3C; the full list of 
genes affecting all samples is shown in fig. S3D). 
We validated chromoplectic looped rearrange- 
ments by deep sequencing or by cytogenetic 
analysis using standard G-banding and spectral 
karyotyping (methods and fig. S4). Using RNA 
sequencing, we found that chromoplectic loops 
universally disrupted the reciprocal fusion (FLI1-
EWSR1); 52% of the cancers with simple rear-
rangements expressed the reciprocal fusion, but 
none of the chromoplectic tumors expressed it 
(n = 27, fig. S5). For further validation of chro- 
moplexy in ES, we reanalyzed a published, inde- 
pendent cohort of 100 ES patients, whose genomes 
had been sequenced, using our informatics pipeline 
(10). The somatic architecture and mutational sig-
nature of the validation cohort is shown in Fig. 1, 
A to C (right panels, validation cohort). Both 
cohorts harbored copy number profiles consist- 
ent with previous reports (fig. S6) (10). With this 
series, the aggregated prevalence of chromoplec- 
tic EWSR1-ETS gene fusions was 42% (52/124).

Patients with relapsed ES have a poor survival 
rate, and new prognostic markers are needed. We 
evaluated the association between chromoplexy, 
patient outcomes, and known markers of worse
prognosis. We found that higher overall genomic
complexity, a marker of aggressive ES (10, 15), was
almost completely explained by chromoplectic re-
arrangements (Fig. 1F). By contrast, there was no
difference in the burden of nonchromoplectic rear-
rangements. Similarly, TP53 mutations, another
established marker of poor prognosis (10, 16), 
were enriched in chromoplexy ES (16 versus 
3%, P < 0.05, Fisher’s exact test). There was no 
enrichment for CDKN2A or STAG2 mutations 
(fig. S7). Finally, and consistent with the above, 
patients with chromoplexy ES were more likely 
to relapse (54 versus 30%, P < 0.05, Fisher’s exact 
test), strongly suggesting that it marks a more 
aggressive variant of ES. 
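
The comparisons above rely on Fisher's exact test. As a minimal sketch of how such a test works on a 2 × 2 table, the following uses only the Python standard library. The function name and the counts are hypothetical placeholders chosen for illustration; the per-group counts behind the reported percentages are not given in this section.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    probs = [prob(x) for x in range(lo, hi + 1)]
    p_obs = prob(a)
    # A small relative tolerance guards against floating-point ties.
    return sum(p for p in probs if p <= p_obs * (1 + 1e-9))


# Hypothetical counts (NOT taken from the study): rows are tumors with and
# without chromoplexy, columns are relapse and no relapse.
hypothetical_table = [[12, 10], [6, 20]]
p_value = fisher_exact_two_sided(*hypothetical_table[0], *hypothetical_table[1])
print(f"two-sided P = {p_value:.4f}")
```

With real data, the four cells would be the observed patient counts in each group; an established statistics library would normally be preferred over a hand-rolled version.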


Chromoplexy generates the key fusion 
in other bone and soft tissue tumors 


We next widened our search across four dif- 
ferent benign and malignant bone and soft tis- 
sue tumor types for which canonical gene fusions 
have been identified (table S2). We subjected 
13 tumors to high- or low-coverage whole-genome 
sequencing, and RNA sequencing where feasible. 
In three tumor types—chondromyxoid fibroma, 
synovial sarcoma, and phosphaturic mesenchy- 
mal tumors—we found that chromoplectic re- 
arrangements (occurring in a similar looped 
formation) did indeed generate canonical gene 
fusions (Fig. 2). Furthermore, in one of the 
chondromyxoid fibroma cases, the fusion emerged 
from chromothripsis across seven different chro- 
mosomes (fig. S8, CMF 2). Chromothripsis was 
seen in five ES cases, of which four involved the 
canonical fusion. Taken together, these findings 
in human bone and soft tissue tumors show that 
canonical gene fusions are frequently caused by 
complex rearrangement processes, predominantly 
chromoplexy but also chromothripsis. 

We examined the microanatomy of chromo- 
plexy fusion loops at base-pair resolution, com- 
paring ES to a published series of prostate cancers 
(2). EWSR1-ETS ES loops were less complex than
TMPRSS2-ERG prostate cancer loops, with fewer 
rearrangements and individual loops involved 
in their generation (2 to 10 rearrangements in 
1 to 2 loops compared with up to 130 rearrange- 
ments in up to 25 loops in prostate cancer). This 
may be a consequence of the ES genome having 
a shorter time frame to mutate compared to the 
prostate cancer genome. Consistent with this pro- 
position, multiple independent chromoplexy loops 
can exist in older prostate cancers, compared to 
the one simple loop seen in ES (7). In contrast 
to ES, where chromoplexy is virtually synony- 
mous with the disease-defining fusion, several 
chromoplexy fusion loops occur in prostate cancer 
without necessarily forming the TMPRSS2-ERG 
fusion. When a loop was present in ES, it almost 
always generated the EWSR1-ETS fusion (47/52
cases, 90%) (Fig. 3, A and B, and figs. S9 and S10). 


Transcriptional disruptions are 
associated with chromoplexy 


These loops also led to targeted disruptions or 
fusions between genes brought together directly
through chromoplexy (n = 168 gene disruptions,
and n = 47 fusions; Fig. 3C). Given that chro- 
moplexy appeared to mark an aggressive form 
of ES, we wondered if its gene expression pro- 
gram was globally different—above and beyond 
the immediate, focal structural consequences 
listed above. We identified 504 differentially expressed
genes in chromoplectic ES compared to
simple ES (P < 0.001, Student’s t test; Fig. 3D).
Gene set enrichment analysis of well-curated 
pathways (18) uncovered a significant enrichment
of dysregulated genes in established cancer
hallmark pathways (table S3).

Both prostate cancer and ES loops were 
characterized by focal intrachromosomal rear- 
rangements, so-called deletion bridges (2), that 
acted as local mediators of large-scale loops 
(illustrated in fig. S11). We found deletion bridges 
in ~60% (30/52) of chromoplectic ES. In contrast 
to the bridges observed in prostate cancer, more 
than a third of bridges are utilized in ES in a 
highly consistent manner. That is, if a deletion 
bridge was found in one component of the loop, 
it would occur on all chromosomes. For example, 
in sample 2226, we observed 13 rearrangements 
spanning three chromosomes, all of which involved
deletion bridges. These bridged chromoplectic
rearrangements fused EWSR1-FLI1 and
disrupted the neighboring gene, AP1B1, as well
as the known cancer gene ARID1B. Thus, deletion
bridges can create further oncogenic disruptions. 

We also observed a remarkable pattern of splic- 
ing, whereby the transcriptional machinery fur- 
ther refined the looped rearrangements found in 
the genome. In chondromyxoid fibromas with 
chromoplectic GRM1 fusions (3/4 cases), the rearrangement
breakpoint did not actually reside
within the GRM1 gene body. Rather, the breakpoint
was found in the upstream gene SHPRH
within a narrow window (fig. S8). Thus, chromo- 
plexy together with conventional splicing leads 
to the promoter swap that is characteristic of this 
tumor [see (19)]. Interestingly, we also observed the 
transcriptional generation of gene fusions in ES. 
Examination of the transcriptomic consequences 
of loops showed that genes that were uncon- 
nected at the DNA level were brought together, 
in cis, at the mRNA level. This included examples
of the EWSR1-ETS fusion itself (Fig. 3D and fig.
S12). In the cases reported here, no direct rearrangement
links EWSR1 and FLI1; however, they
are linked via two rearrangements to a third locus. 
In this way, chromoplexy generates the canonical 
driver gene through a chromoplexy scaffolding event. 


Chromoplexy occurs early in 
tumor evolution through a 
replication-associated mechanism 


We next studied the timing of chromoplexy re- 
arrangements in tumor evolution. Chromoplexy 
may arise from a one-off sudden event, generat- 
ing many breakpoints simultaneously, or through 
stepwise progressive bursts of mutations in suc- 
cession (2). To differentiate between these two 
modes of evolution, we used DNA copy number 
profiling associated with the breakpoints of 
chromoplexy rearrangements to assess the copy

Fig. 1. Mutation landscape of ES. (A to C) The initial cohort consisted of 50
individuals with primary ES tumors, from 
which 23 tumors were whole-genome 
sequenced (Toronto cohort, left). One 
rearrangement screen sample (sample 
4462) is included in this figure. The 
validation cohort consisted of individuals
[...] (age-associated, clock-like signature 1).
Other signatures included 2, 5, 8, 13, 18,
and 31. Signatures 2 and 13 are associated
with the activity of the AID/APOBEC
family of cytidine deaminases, whereas
signature 5 is also clock-like in some 
cancers but less so in ES (11, 13). Signatures 
8 and 18 have an unknown molecular 
etiology; however, it has been suggested
[...] arrows (14/24 for the Toronto cohort and
38/100 for the validation cohort;
aggregated prevalence, 52/124).

(D) Rearrangement breakpoint clusters. The 
aggregated density distributions of the 
genomic distance between consecutive 
rearrangement breakpoints are shown. 
Reciprocal breakpoints are close together 
(~107 bp) because there is an equal
exchange of genetic material arising from a
single break on each chromosome. 
Chromoplectic rearrangements (red) 
overlap this range owing to the proximity 
of breakpoints involved in looped 
rearrangements. Deletion-bridge (DB) chromoplexy (purple) is a looped 
rearrangement cluster in which a deletion spans two breakpoints, resulting in 
breakpoint distances that are farther apart (illustrated in fig. S11). Noncomplex 
breakpoints (simple SVs) are far apart (~10° bp). (E) Schematic diagram of 
chromoplexy fusion loops. Illustrative example of chromoplexy in ES shows 
three chromosomes undergoing double-strand breakage, shuffling, and 
religation in an aberrant configuration. This phenomenon generates the 
canonical fusion, EWSR1-FLI1 (ERG or ETV1), and disrupts a third locus, X, in a
one-off burst of rearrangements. In reality, up to eight chromosomes may be 
disrupted in this looping pattern. A representative genome-wide Circos plot 
depicting genomic rearrangements in an ES tumor (from the discovery cohort), 
which are organized in a loop. (F) Genomic correlates and clinical impact of looped 
rearrangements. In genomes without rearrangement loops, only simple structural
variants (SSVs) exist with an average rearrangement burden of seven rearrangements
per sample. This rate is similar to the background SSV rate (determined
by removing rearrangements involved in a loop) in genomes with rearrangement 
bursts (compare the two red lines). The additional complexity of looped 
rearrangements results in higher genomic instability in these tumors. The most 
common genomic alterations include somatic TP53 mutations, which are rare but 
enriched in patients with complex genomes (top pie chart, orange fraction; P < 
0.05, Fisher's exact test). EWS-ERG fusions are also rare, as they represent 10% of 
all ES diagnoses; however, all EWS-ERG fusion ES tumors are either chromothriptic
or chromoplectic (middle pie chart, orange fraction). Lastly, patients with complex 
genomes tend to relapse (bottom pie chart, orange fraction; P < 0.05, Fisher's 
exact test). All the markers of aggressive disease (high genomic instability, somatic 
TP53 mutations, and relapse) are present in tumors with complex genomes.

number of neochromosomes. A low number of
copy number states (three or fewer) is associated 
with a one-off mutational event, because break- 
age and ligation can only involve a small number 
of chromosomes inside a cell at any given time 
(20, 21). By contrast, stepwise progression would 
result in multiple copy number states owing to 
the possibility of copy number alterations arising 
within older copy number alterations. Chromo- 
plectic breakpoints involve many chromosomes 
and are not associated with any copy number 
alterations (fig. S13). That is, these looped re- 
arrangements across the genome are balanced. 
In addition, using a bespoke algorithm, we found 
that the allele frequency of chromoplectic break- 
points was higher than that of simple structural 
rearrangements, providing further evidence that 
these breakpoints occurred together and early in 
tumor development (methods and fig. S14). Given 
their extremely tight clustering, low number of 
copy number-state transitions, and consistent 
clonal variant-allele frequency, EWSR1-ETS loops
are likely to have arisen from singular bursts of 
rearrangements. 
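
The copy number argument above reduces to a simple counting rule. The sketch below, with hypothetical function names and toy segment values, counts the distinct copy number states across the segments flanking a cluster's breakpoints and applies the three-or-fewer threshold stated in the text. It illustrates the heuristic only and is not the bespoke algorithm mentioned above.

```python
def count_copy_number_states(segment_copy_numbers):
    """Number of distinct integer copy number states among the segments."""
    return len(set(segment_copy_numbers))


def classify_event(segment_copy_numbers, max_states=3):
    """Label a rearrangement cluster using the heuristic from the text.

    Three or fewer copy number states suggests a one-off event; more
    states suggests stepwise accumulation of alterations.
    """
    n_states = count_copy_number_states(segment_copy_numbers)
    return "one-off" if n_states <= max_states else "stepwise"


# Toy inputs (hypothetical): copy number of each segment between breakpoints.
balanced_loop = [2, 2, 2, 2, 2]        # balanced, copy-neutral rearrangements
progressive_cluster = [1, 2, 3, 4, 2]  # nested gains and losses

print(classify_event(balanced_loop))
print(classify_event(progressive_cluster))
```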

We then examined whether genomic regions 
of loop breakpoints share properties that might 
predispose these regions to simultaneous rear- 
rangement. We performed a comprehensive 
analysis of 38 genomic properties, including 
adjacency to histone marks, association with 
replication timing, and proximity to genes and 
repetitive or transposable elements (table S4). 
Of these properties, early replicating DNA and 
features consistent with this were the most 
strongly associated with chromoplexy loops (P < 
1.0 × 10⁻⁸, Mann-Whitney U rank sum test and
Benjamini-Hochberg correction; Fig. 4, A and B). 
In notable contrast, neither nonlooped simple 
breakpoints of ES nor simulated simple break- 
points were significantly associated with repli- 
cation timing or, indeed, any other feature (see 
methods). Replication timing is known to be 
strongly correlated with gene activity, chromatin 
accessibility, and nuclear position (22). Accord- 
ingly, chromoplectic breakpoint positions were 
also strongly associated with high gene density 
and high GC content (fig. S15A). Conversely, 
lamina-associated domains, which are enriched 
in late-replication regions and repressive chroma- 
tin environments, were found to be negatively 
associated with chromoplectic rearrangements. 
These significant associations were upheld when 
breakpoints directly residing in EWSR1, FLI1, or
ERG were removed from the analyses. Of note, 
the same associations were found for looped 
rearrangements of ETS-positive (ETS+) prostate 
cancers but not for simple prostate cancer rearrangements
(fig. S15B). Of further interest, we
noted that the genes affected by chromoplexy 
were among the most highly expressed in ES 
across all patients (top 20%; fig. S16). Most ex- 
pressed genes are found in early replicating DNA 
(23). These data are consistent with the proposed 
model of chromoplexy in which DNA is co- 
localized in transcription hubs, allowing for mul- 
tiple genes from many chromosomes to be broken, 
shuffled, and aberrantly ligated (2). 
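
Because 38 genomic properties were tested, the P values above were corrected with the Benjamini-Hochberg procedure. The following is a minimal standard-library sketch of that correction; the function name and the example P values are hypothetical and are not taken from table S4.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so that adjusted values stay
    # monotone: each is the minimum of p * m / rank over all higher ranks.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw P values for four genomic features.
raw = [0.01, 0.04, 0.03, 0.005]
for p, q in zip(raw, benjamini_hochberg(raw)):
    print(f"raw P = {p:.3f}  adjusted P = {q:.3f}")
```

A feature would then be called significant when its adjusted value falls below the chosen false discovery rate.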

Fig. 2. Genomic catastrophes are common in sarcomas. Copy number profiles for fusion-driven
sarcomas with chromoplexy are shown. Rearrangements are colored red, and the loci with the
canonical fusion are highlighted (blue box) and enlarged on the right. (A) Chondromyxoid fibroma
(CMF) with chromoplexy. The genomic breakpoint lies in the upstream SHPRH gene, and the
BCLAF1-GRM1 fusion was detected by RNA sequencing. Additional complex CMFs, which also show
a SHPRH genomic breakpoint but with the GRM1 fusion found in the RNA, can be found in fig. S8.
(B) Synovial sarcoma with chromoplexy. Chromoplexy generating the SS18-SSX1 pathognomonic
canonical fusion is shown. (C) Phosphaturic mesenchymal tumor (PMT) with chromoplexy. Genome
sequencing of PMTs revealed deletion bridges occurring across the genome at chromoplectic loci,
generating the canonical FN1-FGFR1 fusion.


Mutation patterns of relapsed and 
metastatic ES 


As noted above, chromoplexy arises early in 
the evolutionary history of ES and portends a 
worse prognosis and possible relapse. The ge- 
netic makeup of relapsed ES is largely unknown 
because standard of care for ES does not typ- 
ically involve rebiopsy of the cancer when the 
disease returns or has metastasized. Therefore, 
whether further mutations, chromoplectic or
otherwise, emerge at relapse is unclear, because
few samples have been available. We obtained 
samples from six relapse, metastatic, or second- 
ary tumors and performed whole-genome se- 
quence analysis as well as full mutation and 
signature analyses (Fig. 5A). This included two 
primary-metastasis pairs, two primary-relapse 
pairs, one unpaired metastasis, and one tumor 
in which ES was secondary to a different cancer.
Notably, every relapse or metastatic tumor
contained the chromoplexy-associated fusion,
whether it was from a metastasis at the time of 
diagnosis or a relapse arising later (Fig. 5B). The 
pattern of point mutations was also distinct. 
There was an enrichment of signatures 8 and/or 
18, in addition to the clocklike signature seen at 
diagnosis, suggesting that new processes drive 
relapse and metastatic ES (Fig. 5B). For example, 
in one patient’s tumor, we found a pronounced 


Fig. 3. Characterizing chromoplexy loops 
that generate EWSR1-ETS in ES. (A) Patterns
of looped rearrangements. Chromoplexy circos 
webs demonstrate that patterns of looped 
rearrangements are conserved across samples, 
whereas different genes or loci are affected in 
each cancer (black panels). In each web, individual
samples are indicated by different colors
(and named in the gray panel). In all cases,
central to chromoplexy fusion loops were the
key driver genes: EWSR1 (blue), FLI1 (green),
and ERG (purple). The most frequent patterns of
chromoplexy in ES are those with a three-way
looping structure as well as the presence of
deletion bridges. An enlarged Circos web can be
found in fig. S9 for readability. Three samples
have structures only involving EWSR1, FLI1, and
adjacent loci. Sample 4004 has deletion-bridge
chromoplexy and is described in fig. S3C.
(B) Summary of chromoplexy types. A bar chart
showing the number of rearrangements in a loop
(x axis) and the number of samples with that
rearrangement pattern is shown. Other chromoplexy
web structures can be found in fig. S10.
(C) Transcriptional consequences of chromoplexy:
Gene disruptions and fusions. There are
three mechanisms of gene dysregulation via 
RNA fusion when chromoplexy occurs. The first 
involves two genes (blue and purple boxes) 
brought together by chromoplectic rearrange- 
ments (black lines with arrowheads), leading to 
gene disruptions (first scenario, shown at the 
top) and in-frame fusions (second scenario).
This was detected in the 3/10 cases where 
there was genome (with chromoplexy) and 
transcriptome sequencing available. When 
RNA sequencing was not available, these were 
predicted to cause fusions (n = 47, excluding 
the EWSR1-ETS driver) and gene disruptions
by fusing genes in opposite transcriptional 
orientation or fusing a gene to an intergenic 
sequence (n = 168). The second mechanism 
involves two chromoplexy genes brought 
together by a rearrangement at the genomic 
level, but one of the partner’s neighboring 
genes (green box) is transcriptionally fused to 
the other chromoplexy partner in its place 
(third scenario). This is also the predominant 
mechanism of GRM1 fusion generation in
chondromyxoid fibromas (fig. S8). The final mechanism of gene
dysregulation occurs when chromoplexy facilitates the production of a 
fusion by acting as a molecular scaffold (fourth scenario; also 
illustrated in fig. S12). Two genes are both rearranged to a third locus 
(orange) and are then transcriptionally fused together. No direct 
genomic link exists between these two genes. These phenomena can 
only be detected if both whole-genome and RNA sequencing are

increase of COSMIC signature 31, which has been
recently associated with exposure to platinum 
therapy in chronic myelomonocytic leukemia 
(24). Notably, our patient had been treated with 
carboplatin for retinoblastoma 3 years before 
diagnosis of ES. At least three other patients in 
the validation cohort had a similar signature in 
their ES, which may also have been treatment 
induced (Fig. 1B).


Early divergence and parallel evolution
of ES tumors 

A predominant model for tumor progression 
posits that metastases originate directly from 
the primary tumor. The metastatic lesions may 
have acquired new mutations, but, because they 
are thought to be derived by linear clonal evolution,
most of the genomic properties of the primary
tumor will be found in the metastasis (25).

available. (D) Transcriptional consequences of chromoplexy: Gene expression.
Volcano plot illustrating the differential gene expression
in chromoplexy versus nonchromoplexy ES, revealing 504 differentially
expressed genes. Points greater than 1 or less than −1 and above 1.3
(as indicated by the red lines) are genes that are significantly differentially
expressed (blue dots). Red dots highlight genes that are differentially
expressed and involved in a cancer hallmark pathway.

A different model was suggested for ES, on
the basis of mutation data from two primary- 
metastasis pairs whose exomes were sequenced 
(8). Specifically, it was proposed that metastases 
diverged from the primary tumor early, although 
the timing of this divergence was not established. 
We compared coding, noncoding, and structural 
rearrangements across the genome within four 
ES pairs. As is the case in most tumor types, 
relapse and metastatic ES tumors acquired many 
new mutations (on average, 50% were unique, or 
“private,” mutations). A notably high number 
of clonal mutations from the primary ES were 
lost in the relapse (average 20%), confirming that 
the latter diverged early, evolving in parallel. 
For example, a disruptive clonal PTEN inversion 
was found in all tumor cells of one primary ES 
but was absent from the relapse (Fig. 5C). We 
also confirmed the same model of parallel evo- 
lution in one additional primary-metastatic 
pair, profiled using microarrays (fig. S17). The 
clinical implication of this model is that one 
should be searching for therapeutically target- 
able mutations arising in parallel with those in 
the primary ES, using biomarkers like circulating 
tumor DNA, because these mutations would 
not necessarily be present in the primary tumor. 
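
The shared, private, and lost fractions discussed above can be computed with plain set operations once each tumor's mutations are keyed consistently. The sketch below is illustrative only: the function name, the tuple key (chromosome, position, reference allele, alternate allele), and the toy calls are assumptions, and restricting the primary set to clonal mutations is left to the caller.

```python
def compare_tumor_pair(primary, relapse):
    """Summarize shared, private, and lost mutations for one tumor pair.

    Both arguments are sets of hashable mutation keys, for example
    (chromosome, position, ref, alt) tuples.
    """
    shared = primary & relapse
    private_to_relapse = relapse - primary
    lost_from_primary = primary - relapse
    return {
        "shared": len(shared),
        # Fraction of relapse mutations not seen in the primary tumor.
        "private_fraction": len(private_to_relapse) / len(relapse),
        # Fraction of primary mutations absent from the relapse.
        "lost_fraction": len(lost_from_primary) / len(primary),
    }


# Toy mutation calls (hypothetical coordinates, not study data).
primary_calls = {("chr1", 100, "A", "T"), ("chr2", 200, "G", "C"),
                 ("chr3", 300, "C", "T"), ("chr4", 400, "T", "G"),
                 ("chr10", 500, "G", "A")}
relapse_calls = {("chr1", 100, "A", "T"), ("chr2", 200, "G", "C"),
                 ("chr3", 300, "C", "T"), ("chr4", 400, "T", "G"),
                 ("chr5", 600, "A", "G"), ("chr6", 700, "C", "A"),
                 ("chr7", 800, "T", "C"), ("chr8", 900, "G", "T")}

print(compare_tumor_pair(primary_calls, relapse_calls))
```

A nonzero lost fraction among clonal primary mutations is what argues against strictly linear evolution, since a direct descendant would inherit every clonal mutation of its parent.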

To determine when the divergence of the lethal 
clone occurred, we used the number of COSMIC 
signature 1 mutations, which emerge at a steady 
rate in ES (see methods and fig. S18). We first 
confirmed our approach by comparing the num- 
ber of signature 1 mutations between established 
time intervals, such as the dates of diagnosis and 
recurrence. In all cases, the observed number 
of mutations was extremely close (75 to 90%) 
to what would be expected (fig. S18). Using the 
established rate, we calculated the amount of 
time between the divergence of the primary 
and relapse or metastatic tumors. Notably, the 
common ancestor in ES clonally diverges 1 to 
2 years before diagnosis. Therefore, the cells 
that give rise to the primary and relapse tumor 
can exist in the patient years before diagnosis, 
providing a window for early cancer detection 
and surveillance. ES is often difficult to diagnose, 
and the time to diagnosis is notoriously long 
(26). These findings provide a plausible bio- 
logical mechanism for this latency. 
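
The timing estimate above can be read as a constant-rate molecular clock: a mutation count divided by a rate gives elapsed time. The sketch below shows that arithmetic under a simplified model. The function names, the rate, and the counts are hypothetical, and the authors' actual procedure (see their methods and fig. S18) may differ in detail.

```python
def mutation_rate(n_mutations, years_elapsed):
    """Signature 1 mutations per year over a known time interval."""
    if years_elapsed <= 0:
        raise ValueError("years_elapsed must be positive")
    return n_mutations / years_elapsed


def observed_to_expected(n_observed, rate_per_year, years_elapsed):
    """Ratio used to sanity-check the clock on an interval of known length."""
    return n_observed / (rate_per_year * years_elapsed)


def years_since_divergence(n_private_mutations, rate_per_year):
    """Time needed to accumulate the private mutations at a constant rate."""
    if rate_per_year <= 0:
        raise ValueError("rate_per_year must be positive")
    return n_private_mutations / rate_per_year


# Hypothetical numbers for illustration only.
rate = 20.0  # assumed signature 1 mutations per year
print(observed_to_expected(n_observed=34, rate_per_year=rate, years_elapsed=2.0))
print(years_since_divergence(n_private_mutations=30, rate_per_year=rate))
```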


Discussion 


Our analyses reveal rearrangement bursts (chro- 
moplectic loops) as a source of gene fusion in 
human bone and soft tissue tumors. ESs with 
complex karyotypes are associated with a poorer 
prognosis than those with simpler karyotypes 
(27), and here we show chromoplexy as the 
mechanism in 42% of tumors. It is possible 
that the chromoplectic tumor’s additional gene 
disruptions and fusions contribute to the dif- 
ference in patient survival. Our whole-genome 
sequence data support a model in which there 
is an early clone of ES, containing EWSRI-ETS 
and chromoplexy, arising at least 1 year before 
diagnosis, which gives rise to both the primary 
and metastatic or relapse tumors (Fig. 5D). 
Whether the bursts described here are chance

Fig. 4. Early replicating DNA and chromoplexy. (A) Heatmap of genomic property associations.
The genomic properties listed in table S4 were calculated for all rearrangements in both cohorts.
Complex rearrangements (chromoplexy and chromothripsis), exclusively, are strongly associated
with early replication timing and other genomic features consistent with this feature (gene density,
CpG density, Alu density, etc.). Table values are indicative of false discovery rate-corrected P values
compared to a million random points in the genome. Blue highlights in the figure are indicative
of a Cohen's d greater than or equal to 0.3. Bold boxes indicate a positive (red, enrichment) or 
negative (blue, depletion) association with the feature. All features were evaluated in 1-kb bins across 
the genome. For feature density metrics, associations were calculated in 1-Mb sliding windows 
centered in 1-kb bins. (B) Density distribution of the average wavelet-smoothed signal (WSS) and 
single-nucleotide variations (SNVs) on a representative chromosome. The average WSS, of 
replication timing, is plotted for a subset of chromosome 6 to illustrate changes between early and 
late replication timing and the co-association with mutations in ES. The positional variation of 
replication timing across the chromosome is depicted as changes in density and color. Point 
mutations peak in late-replicating regions (dip in WSS, light purple), whereas complex 
rearrangements peak in regions of early replication timing (peak in WSS, dark purple). 


events or driven by specific mutational processes, 
akin to the RAG machinery operative in leuke- 
mia, remains to be established. As an increasing 
and diverse number of tumor genome sequences 
become available, we may be able to define fur- 
ther rearrangement processes that underlie fusion 
genes and thus unravel the causes of fusion-driven
human cancers.


Materials and methods
Patient and sample collection 


ES tumor and matched blood samples were 
collected from the Hospital for Sick Children 
(SickKids) and Mount Sinai Hospital in Toronto, 
Canada, in accordance with each institution’s 
Research Ethical Board (REB) guidelines. Detailed
clinical information (age at presentation,

Fig. 5. Mutation signatures of relapse and
metastatic ES tumors. (A) Prevalence of
mutation signatures in relapse and metastatic
(Met) tumors. Shared and private mutations for
four primary-metastatic or relapse pairs are
shown (first four columns). Signatures 1 and
5 are common throughout, with signature
5 contributing considerably to the mutations
that arise at relapse. Signature 8 was also
common throughout the cohort. One metastatic 
tumor (no paired primary) is also shown to have 
similar mutation signature patterns as other 
metastatic or relapse tumors. Lastly, a 
secondary ES tumor to a primary retinoblastoma 
(germline RB1 mutation identified) was also 
sequenced in this cohort. This patient harbored 
the rare signature 31, which likely resulted from 
the patient's prior exposure to carboplatin for 
their primary retinoblastoma (the only patient 
to receive this treatment in the Toronto cohort). 
(B) Phylogenetic trees of primary-relapse and 
primary-metastatic ES. Using the shared and 
private mutations, we identified the mutational 
order in ES. Known cancer-driver mutations 
(IDH1, TP53, etc.) arise early (shared branches). 
LOH, loss of heterozygosity; del, deletion; inv, 
inversion. (C) A clonal PTEN inversion. A PTEN 
inversion was found in the primary but not in the 
relapse tissue, suggesting that the inversion 
arose after early divergence of a common clonal 
ancestor. However, a pathogenic PTEN SNV can 
be found in the relapse tissue. Together, these 
point toward parallel, convergent evolution on 
this gene. (D) Proposed model of ES tumor 
evolution. After birth, signature 1 is operative in 
all somatic tissues throughout life. ES patients’ 
cells experience a replication-associated burst of 
rearrangements that generates the canonical 
fusion driver. Early somatic cancer gene 
mutations occur before clonal bifurcation. This 
occurs 1 to 2 years before an ES diagnosis; thus, 
the cells that would give rise to the relapse 
existed years before diagnosis. Signature
5 contributes substantially to the number of
mutations seen at relapse. 


gender, tumor site, stage, etc.) were obtained 
from the corresponding institutional tumor 
banks (table S5). Overall, the patients’ clinical
features and demographics were typical of ES: 
the average age of diagnosis was 14.8 years (2.8 to 
36.6 years); the male to female ratio was 1.38:1; 
and 14 patients had relapsed, with 13 having died 
from their disease. Additional samples (n = 3)
were also obtained from Universitätsklinikum
Erlangen, Erlangen, Germany. All metastatic or 
relapse ES tumors were collected from the 
SickKids tumor bank or the SickKids cancer se- 
quencing program (KiCS). Detailed information on 
KiCS is available at https://www.kicsprogram.com. 

Of the 25 high-coverage genomes sequenced, 
EWSR1-ETS fusions were detected in all patients
except for a 37-year-old individual who was in- 
stead found to have a FUS-ERG translocation. 
This patient’s gene expression profile (by RNA- 
seq) was also discrepant, so they were re- 
moved from subsequent analyses (fig. S19). One 
additional genome was removed due to poor 
sequencing quality. We also performed low pass 
(~10X) rearrangement screens on 19 ES samples.
However, as we required breakpoint resolution,
all but one of the rearrangement screens were 
excluded from this study due to insufficient cov- 
erage (see table S1, orange row). Taken together, 
our discovery cohort consisted of individuals 
from which 23 standard genomes (30X to 60X) 
and one rearrangement screen genome (20X) 
were sequenced. The validation cohort consisted 
of individuals from which 119 tumor-normal sam- 
ples were sequenced by Tirode et al. (10), which 
we downloaded from the European Genome- 
phenome Archive (accessions: EGAS00001000855 
and EGAS00001000839). Of these, 19 patient 
samples were omitted either because the EWSR1-ETS
fusion was not detected by our pipeline and
manual inspection of the aligned reads, or be- 
cause they harbored an excess of artefactual small 
inversions or deletions. 


Code availability 


Custom code described here is available at 
github.com/shlienlab. 


High-throughput sequencing 
and alignment 


Exome, genome, and transcriptome (RNA-seq) 
sequencing were performed using established 
protocols on Illumina instruments. For exome 
and genomes, paired-end FASTQ files were 
aligned to the human genome (hg19/GRCh37) 
using BWA-MEM (v.0.7.8); Picard MarkDuplicates 
(v.1.108) was used to mark PCR duplicates. Indel 
realignment and base quality scores were recali- 
brated using the Genome Analysis Toolkit (v.2.8.1). 


Detection of high-quality somatic 
substitutions and rearrangements 


We detected somatic mutations using established 
tools [MuTect2 (part of GATK v.3.5) (28) and 
Delly v.0.7.1 (29)]. To evaluate and validate our 
WGS substitution pipeline, we used a “gold stan- 
dard” cancer genome tumor-normal dataset, 
COLO829 (30). Using this somatic reference 
standard, we determined our precision to be 
0.885 and our sensitivity to be 0.971. Copy number
was detected for genomes and rearrangement
screens using BIC-seq v.1.2.1 (31). When no
matched normal was available (in the case of 
rearrangement screens), an ES normal was used. 
We then developed custom code to increase spec- 
ificity of putative substitution and rearrange- 
ment detection, as follows: 

1. Somatic and depth filter. No mutation should 
exist in the matched-normal sequence. For substitutions,
we removed common single-nucleotide
polymorphisms (SNPs) as previously described
(32) and required >10X coverage at the mutated
locus (10-kb window), in both tumor and normal.
For rearrangements, this filter required ≥4 discordant
read-pairs in the tumor. We then directly
interrogated the normal BAM file, at each puta- 
tive somatic rearrangement to ensure no germline 
variants existed near the breakpoint, on either 
side of the rearrangement. 

2. Panel of normals filtering. To remove com- 
mon germline variants, we created a panel of 
normal, nonneoplastic, samples that had been
sequenced using the same technology and to a
similar depth of coverage (n = 133). We removed 
any putative rearrangements if present in >2 
normals. For rearrangements, breakpoints must 
exist on both sides of the junction within a 1-kb 
window. We found that as we increased the 
number of normals in our panel, our specificity 
increased (fig. S1C). 

3. Quality-control filtering. Putative rear- 
rangements were removed if supported by reads 
with MAPQ < 30. Putative rearrangements were 
removed if they met any two of the following 
criteria: 

(i) Non-unique mapping. <70% of the reads 
at the locus map uniquely. 

(ii) Multimapping clusters. At the same locus 
(200 bp up and downstream), a pattern of mul- 
tiple overlapping groups of discordant reads 
whose paired-ends align to different chromo- 
somes (>3 reads in each group, mapping to 
>4 chromosomes). Seen in both the tumor 
and paired normal. 

(iii) High depth. Excessively high depth align- 
ments in difficult-to-align regions of the genome, 
as described (33). We applied a maximum depth
threshold of d + 4*sqrt(d), where d is the mean
read depth of the chromosome in
the corresponding normal.

(iv) Low-complexity regions. Overlap with a 
highly repetitive sequence (using DUST (34) with 
score >60). 
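
As an illustration only, the quality-control rule above (a MAPQ cutoff, then removal when any two of criteria i to iv hold) can be sketched in Python. The `Candidate` fields and function names are hypothetical and are not taken from the published pipeline. The multimapping-cluster flag is assumed to be computed upstream, and the MAPQ rule is read here as a cutoff on the lowest MAPQ among supporting reads, which is one possible interpretation.

```python
import math
from dataclasses import dataclass


def max_depth_threshold(mean_depth: float) -> float:
    """Depth ceiling d + 4*sqrt(d), with d the mean depth of the chromosome in the normal."""
    return mean_depth + 4.0 * math.sqrt(mean_depth)


@dataclass
class Candidate:
    # Hypothetical per-rearrangement summary; field names mirror the criteria in the text.
    min_mapq: int                # lowest MAPQ among supporting reads (assumed reading)
    frac_unique: float           # fraction of reads at the locus that map uniquely
    multimapping_cluster: bool   # criterion (ii), assumed precomputed upstream
    depth: float                 # read depth at the locus
    chrom_mean_depth: float      # mean depth d of the chromosome in the normal
    dust_score: float            # DUST low-complexity score


def passes_qc(c: Candidate) -> bool:
    """Return True if the candidate survives the quality-control filter."""
    if c.min_mapq < 30:
        return False
    flags = [
        c.frac_unique < 0.70,                                # (i) non-unique mapping
        c.multimapping_cluster,                              # (ii) multimapping clusters
        c.depth > max_depth_threshold(c.chrom_mean_depth),   # (iii) high depth
        c.dust_score > 60,                                   # (iv) low complexity
    ]
    # A candidate is removed when it meets any two of the four criteria.
    return sum(flags) < 2
```

For a chromosome with a mean depth of 36, the ceiling in this sketch is 36 + 4 × 6 = 60 reads.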


Mutation signature extractions 
and analysis 


First, a de novo extraction was performed on 
the catalog of ES point mutations to produce 
novel consensus mutational signatures. These 
signatures were deciphered using a previously 
described computational framework that opti- 
mally explains the proportion of each mutation 
type found in the catalog and then estimates 
the contribution of each signature to the mu- 
tation catalog (17). Overall, we identified 11 con- 
sensus mutational signatures. Four of these 
signatures were previously found to be attri- 
buted to sequencing artifacts. We then com- 
pared our true consensus mutational signatures 
to the previously curated COSMIC list and 
quantified their similarity using a cosine sim- 
ilarity as previously done (13). We report >0.9 
cosine similarity between the Ewing signatures 
and the COSMIC list. 
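
For readers unfamiliar with the metric, the cosine similarity between an extracted signature and a reference signature can be computed as in the following minimal sketch. The function names and the dictionary layout of the reference set are illustrative assumptions; each signature is taken to be a vector of mutation-type proportions (for example, 96 trinucleotide contexts).

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("signatures must have the same number of mutation types")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("signatures must not be all zeros")
    return dot / (norm_a * norm_b)


def best_match(signature, reference):
    """Return (name, similarity) of the closest signature in a {name: vector} reference."""
    name = max(reference, key=lambda k: cosine_similarity(signature, reference[k]))
    return name, cosine_similarity(signature, reference[name])
```

A value of 1 indicates identical mutation-type proportions, so a similarity above 0.9 reflects a close match to a curated signature.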


Validation by targeted 
custom-capture sequencing 


A custom targeted-capture enrichment system 
was designed to capture 1 Mb of DNA (Nextera, 
Illumina) with custom probes for the whole of 
the EWSR1, FLI1, and ERG genes as well as the
exons of TP53, STAG2, and ATRX. We also targeted
known complex breakpoints from the
discovery cohort, achieving 900- to
1000-fold coverage. We reasoned that paired-end
sequencing would capture any locus joined
to the three core genes, even if the panel did 
not specifically target it. In this way, we validated
rearrangements in samples where chromoplexy
was already known from the whole
genome and uncovered new instances in sam- 
ples that had not been whole-genome sequenced 
(n = 7 and 4, respectively). Each tumor had three 
or four rearrangements validated using the panel. 
All had the same breakpoint (as found by the 
whole-genome sequence) and were found to 
harbor looped rearrangements on the same
derivative chromosomes. 


Validation by FISH, G-banding, 
or spectral karyotyping 


We further validated these looped rearrange- 
ments by karyotyping ESs using standard G- 
banding as well as spectral karyotyping (n = 17 
and 3; fig. S4). By cytogenetics we found additional
complexity (beyond the canonical chr22-chr11
translocation) in eight cases. Of these, six
tumors had been sequenced and found to be 
complex. Additionally, there were five cases for 
which chromoplexy was detected by genome 
sequencing yet not found by cytogenetics tech- 
niques, indicating that routine cytogenetics may 
miss chromosomal complexity present in these ge- 
nomes due to the nature of these submicroscopic 
complexities (fig. S20). 


Timing of rearrangements using 
breakpoint allele fraction 


To determine the timing of the chromoplectic 
loops, we developed a tool to accurately measure 
the breakpoint allele fraction (BAF) of each rearrangement.
The BAF is the number of reads
containing a rearrangement breakpoint divided
by the total number of reads (illustrated in
fig. S14A). It is analogous to the variant allele
fraction (VAF) of substitution mutations
and, similarly, can be used to infer the
relative order of rearrangement mutations. The
tool accurately counts all reads supporting each 
rearrangement, even if these had not been used 
to nominate the rearrangement in the first place. 
From the raw aligned reads, we first collected all 
split reads near the breakpoint (within 20 bp) 
from one side of the rearrangement. Next, we 
extracted the clipped sequence (i.e., the non- 
aligned portion) from these reads and attempted 
to map it to the other side of the rearrangement 
(within 70 bp of the breakpoint) using a Smith- 
Waterman algorithm (35). Clipped sequences 
shorter than 5 bp were discarded, as were those 
that failed to map to the other side of the rearrangement
(<80% similarity). Since the retained
sequences can map at slightly different
positions, due to microhomology near the breakpoint,
we considered all those close to one
another as supportive of the same rearrange- 
ment. Overall, we found that most rearrange- 
ments are supported by remapped reads that are 
less than 10 bp apart. Finally, the total number 
of split and realigned reads was divided by
the average coverage between the two break- 
points per side of each rearrangement. This 
allowed us to arrive at an accurate measure of 
the breakpoint allele fraction. To validate our 
tool, we applied it to a curated list of known 
polymorphic copy number variants (CNVs) (36).
As expected, the BAF of germline CNV deletions 
followed a bimodal distribution with peaks at 
0.5 and 1.0, for heterozygous and homozygous 
rearrangements, respectively (fig. S14B, green 
line). We then compared the BAF of somatic 
rearrangements in chains to those without. 
Chained rearrangements had higher BAFs than 
simple structural variants (fig. S14B, red versus 
blue line), confirming that chromoplectic rear- 
rangements are in fact earlier. 
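
The counting logic described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' tool: an ungapped identity score stands in for the Smith-Waterman alignment, the clipped sequences and the partner-side reference sequence are assumed to be extracted already, and all function and parameter names are hypothetical.

```python
def similarity(clip: str, target: str) -> float:
    """Fraction of clipped bases matching the target, ungapped (stand-in for Smith-Waterman)."""
    if not clip:
        return 0.0
    matches = sum(1 for a, b in zip(clip, target) if a == b)
    return matches / len(clip)


def breakpoint_allele_fraction(clips, partner_seq, mean_coverage,
                               min_clip_len=5, min_similarity=0.80):
    """Supporting split reads divided by the average coverage around the breakpoints."""
    if mean_coverage <= 0:
        raise ValueError("mean_coverage must be positive")
    supporting = 0
    for clip in clips:
        if len(clip) < min_clip_len:
            continue  # clipped sequences shorter than 5 bp are discarded
        if similarity(clip, partner_seq) < min_similarity:
            continue  # fails to map to the other side of the rearrangement
        supporting += 1
    return supporting / mean_coverage
```

Under these assumptions, a heterozygous germline deletion with 10 supporting reads at 20× average coverage would yield a BAF of 0.5.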


Detection of breakpoint clusters 
of chained rearrangements 


Using their interbreakpoint distance, we iden- 
tified rearrangements within 10 kbp of one 
another. Using these, we created an undirected 
graph in which two rearrangement breakpoints 
within 10 kbp of one another (a breakpoint 
cluster) were represented as a vertex and con- 
nected to other breakpoint clusters (rearrange- 
ments are edges in the graph). We selected 
connected components of the graph and identified 
components with greater than one vertex as inter- 
connected rearrangements. In most of our cases, 
these interconnected rearrangements formed 
chains or loops, where one could follow the 
edges around the graph and return to the initial 
vertex of departure. These were further filtered 
for reciprocal rearrangements or overlapping 
intrachromosomal rearrangements. Chromoplexy 
rearrangements were validated by manual inspec- 
tion and using the ChainFinder algorithm (2). 
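
The graph construction can be illustrated with a short sketch. It reflects one reading of the description: breakpoints on the same chromosome are merged by single-linkage clustering at 10 kbp, a cluster counts as a vertex only if it holds at least two breakpoints, and each rearrangement joining two distinct vertices is an edge. The function name and input layout are hypothetical, and the later filtering for reciprocal or overlapping intrachromosomal rearrangements is not included.

```python
from collections import defaultdict


def chained_components(rearrangements, max_dist=10_000):
    """Group rearrangements into interconnected components.

    rearrangements: list of ((chrom, pos), (chrom, pos)) breakpoint pairs.
    Returns sorted lists of rearrangement indices, one list per connected
    component containing more than one vertex.
    """
    # One record per breakpoint: (chrom, pos, rearrangement index, side).
    breakpoints = []
    for idx, (left, right) in enumerate(rearrangements):
        breakpoints.append((left[0], left[1], idx, 0))
        breakpoints.append((right[0], right[1], idx, 1))
    breakpoints.sort()

    # Single-linkage clustering along each chromosome (an assumption).
    cluster_of = {}
    cluster_size = defaultdict(int)
    cluster_id = -1
    prev_chrom, prev_pos = None, None
    for chrom, pos, idx, side in breakpoints:
        if chrom != prev_chrom or pos - prev_pos > max_dist:
            cluster_id += 1
        cluster_of[(idx, side)] = cluster_id
        cluster_size[cluster_id] += 1
        prev_chrom, prev_pos = chrom, pos

    # Vertices are clusters with at least two breakpoints; rearrangements are edges.
    adjacency = defaultdict(set)
    edge_vertex = {}
    for idx in range(len(rearrangements)):
        u, v = cluster_of[(idx, 0)], cluster_of[(idx, 1)]
        if u != v and cluster_size[u] >= 2 and cluster_size[v] >= 2:
            adjacency[u].add(v)
            adjacency[v].add(u)
            edge_vertex[idx] = u

    # Depth-first search for connected components.
    seen = set()
    components = []
    for start in list(adjacency):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        if len(component) > 1:
            components.append(sorted(i for i, u in edge_vertex.items() if u in component))
    return components
```

In this sketch, three rearrangements whose breakpoints pair up on chr22, chr11, and chr7 form a single closed loop, whereas an isolated translocation contributes no vertices and is ignored.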


Association of rearrangements 
with genomic features 


We formally evaluated the association of re- 
arrangement position with 38 properties of the 
human genome (table S4). We separately evaluated
each of these associations in 1-kb bins
across the genome. Feature density properties 
were calculated as densities in various sliding 
windows (1 kb, 10 kb, 100 kb, and 1 Mb) centered
on each 1-kb bin or as the log distance,
as indicated in table S4. The positions of ES 
rearrangements were compared to a million 
random positions that had been uniformly sam- 
pled from regions of the genome where confi- 
dent genotypes could be determined (i.e., the 
“callable” genome). We limited our analysis to 
chromosomes 1 to 22 and X. To test for signifi- 
cant associations between our rearrangements 
and these genomic properties, we performed a 
Mann-Whitney U rank sum test and applied Benjamini
and Hochberg FDR correction to the raw P values.
We used the Cohen’s d metric to determine the 
effect size between the two groups to account 
for differences in sample size. We applied an 
absolute Cohen’s d cut-off of 0.3, a medium 
effect size (37, 38). Genomic properties were 
considered significantly different between rearrangements
and random positions if the absolute
value of d was ≥0.3 and the corrected P < 0.05.
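
The two summary statistics used for this decision can be sketched in plain Python. This is an illustrative sketch: the rank sum test itself is not reimplemented (raw P values are assumed to come from a statistics package), Cohen's d is computed with the usual pooled standard deviation, and the function names are hypothetical.

```python
import math
from statistics import mean, variance


def cohens_d(x, y):
    """Cohen's d with a pooled standard deviation (each group needs >= 2 values)."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    return (mean(x) - mean(y)) / math.sqrt(pooled_var)


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def is_associated(d, q, d_cutoff=0.3, q_cutoff=0.05):
    """Apply the two thresholds stated in the text: |d| >= 0.3 and corrected P < 0.05."""
    return abs(d) >= d_cutoff and q < q_cutoff
```

Requiring an effect-size threshold in addition to the corrected P value guards against calling trivially small differences significant when one group contains a million random positions.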


Detection of gene fusions 


We detected gene fusions in regions of genomic 
complexity using an approach that integrates 
multiple independent fusion algorithms, and
then removed those found in normal tissue. 
Putative fusions were validated by de novo 
assembly. A total of 1277 normal (nonneoplastic) 
samples from 43 different tissues were obtained 
from the NHGRI GTEx consortium (database 
version 4) and used to remove artifacts. All 
fusions were visually inspected if one or both 
genes involved chromoplexy or were adjacent 
(up to 1 Mbp). Fusions were further filtered by 
quality of the realigned transcript, breakpoint 
coverage, and gene expression. 


Detection of gene expression 


Gene expression for fusion analysis, differential
expression analysis, and principal component
analysis was quantified using HTSeq (39) to
count the reads aligning to every gene. PCR
duplicates and reads mapping to ribosomal 
RNA, miRNA, and small nucleolar RNA were 
removed. We used the trimmed mean of M-values
(TMM) method in the edgeR package to perform
normalization on genes with at least 1 read
per million bases in at least three samples 
(40, 41). Differential expression analysis in 
chromoplexy versus nonchromoplexy samples 
was performed using a generalized linear model 
(GLM) likelihood ratio test, taking into consideration
different sources of variation such as batch,
gender, and age. P values for the GLM test were
adjusted for multiple testing using the Benjamini 
and Hochberg method for controlling the false 
discovery rate (FDR). Differentially expressed 
genes in chromoplexy versus nonchromoplexy 
were considered statistically significant if FDR < 
0.05 and the absolute value of log(fold change) ≥ 1.
Pathway analysis was performed on genes dif- 
ferentially expressed in samples with and with- 
out chromoplexy using Gene Set Enrichment 
Analysis (GSEA) software (javaGSEA v2.2.4). 
Cancer gene signatures were selected from the 
hallmark collection from the Molecular Signatures
Database (MSigDB) (18). Enrichment scores
for the hallmark pathways were considered sta- 
tistically significant if FDR < 0.01. 
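
The expression filter and the significance call described in this section can be sketched as below. The sketch makes two assumptions that should be checked against the original analysis: the abundance filter is interpreted as counts per million mapped reads, and the fold-change threshold is applied to whatever logarithm the fitted model reports. It does not reproduce TMM normalization or the GLM fit, and the function names are hypothetical.

```python
def keep_gene(counts, library_sizes, min_cpm=1.0, min_samples=3):
    """Keep a gene with at least `min_cpm` reads per million in at least `min_samples` samples."""
    cpm = [c / n * 1e6 for c, n in zip(counts, library_sizes)]
    return sum(1 for value in cpm if value >= min_cpm) >= min_samples


def is_differentially_expressed(log_fold_change, fdr,
                                fdr_cutoff=0.05, lfc_cutoff=1.0):
    """Thresholds from the text: FDR < 0.05 and |log(fold change)| >= 1."""
    return fdr < fdr_cutoff and abs(log_fold_change) >= lfc_cutoff
```

Filtering out weakly expressed genes before testing reduces the number of hypotheses and therefore the severity of the multiple-testing correction.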


Evaluation of replication timing 
in prostate cancer rearrangements 


We obtained prostate cancer rearrangements, 
including chained and others, from the Baca et al.
publication [supplemental tables S3C and S5 
from (2)]. Samples were annotated as “ETS+” or 
“ETS−” using supplemental table S1 from (2).
ETS+ fusions include any ETS fusion detected 
by sequencing (including ERG and ETV1). Using 
this list, we performed the same test for ge- 
nomic property enrichment as we did in ESs. 


Molecular inversion probe 
(MIP) microarray 


Raw MIP data from three additional primary- 
metastatic ES pairs were obtained from the 
Huntsman Cancer Institute, Salt Lake City, Utah 
(42). The original source material was clinically 
archived, formalin-fixed paraffin-embedded (FFPE) 
scrolls that were retrieved from three individ- 
ual patients diagnosed with ES. Primary tumor 
samples were from diagnostic biopsies taken
before chemotherapy. The raw MIP data from 
the completed assay were loaded into Nexus
Copy Number (BioDiscovery, Inc., El Segundo, 
California) for copy number detection using 
default settings. 
SIGNAL TRANSDUCTION


Cancer mutations and targeted 
drugs can disrupt dynamic signal 
encoding by the Ras-Erk pathway 


L. J. Bugaj, A. J. Sabnis, A. Mitchell, J. E. Garbarino, J. E. Toettcher*, 


T. G. Bivona*, W. A. Lim* 


INTRODUCTION: Signaling pathways, such 
as the Ras-Erk (extracellular signal-regulated 
kinase) pathway, encode information through 
both their amplitude and dynamics. Differences 
in signal duration and frequency can lead to 
distinct cellular output decisions. Thus, tem- 
poral signals must be faithfully transmitted 
from the plasma membrane (Ras) to the nucleus 
(Erk) to properly control the cell’s response. 
Because the Ras-Erk pathway regulates impor- 
tant cell decisions such as proliferation, changes 
to dynamic signal transduction properties 
could result in improper cell decisions and 
dysfunction. However, it has been difficult to 
examine whether corruption of signal trans- 
mission dynamics is associated with diseases 
such as cancer. 


RATIONALE: We used optogenetic stimula- 
tion of the Ras-Erk pathway to quantitatively 
screen whether cancer mutations and drug 
treatments alter the fidelity of dynamic signal 
transmission. Most cancer-associated mutations 
in the Ras-Erk pathway are thought to drive 
cancer by inducing constitutive pathway 
activation—a high basal amplitude of activity. 


We explored whether cancer cells might also
have altered dynamic properties that could contribute
to disease. We used live-cell microscopy
and new high-throughput optogenetic devices
to systematically measure cell responses to a
broad range of dynamic input stimulus patterns.
We could detect subtle but important
perturbations in pathway signal transmission
properties by monitoring how these upstream
stimulus patterns (generated by use of Ras-activating
optoSOS) altered pathway output
at the downstream levels of signaling, gene
expression, and cell proliferation.


RESULTS: We found that cells that harbor
particular B-Raf mutations (in the kinase
P-loop) exhibit substantially corrupted dynamic
signal transmission properties. In particular,
the kinetics of Ras-Erk pathway inactivation
are substantially slowed (half-time for signal
decay is 10-fold longer). In these cancer cells,
the active Erk output signal remains abnormally
high for ~20 min after Ras input activity
(optoSOS) is withdrawn (compared with
1 to 2 min for normal cells). Mutants or drugs
that enhance B-Raf dimerization led to similar
slow pathway deactivation. We could pinpoint
B-Raf as the node responsible for altered
transmission by using a combination of
small-molecule inhibitors and optogenetic stimulation
at alternative input points.

Elongated pathway decay kinetics resulted
in physiologically important cellular misinterpretation
of dynamic inputs. In response to
pulsatile inputs with intermediate frequencies,
the perturbed cells responded with transcriptional
profiles typically observed with sustained
inputs. This signal misinterpretation propagated
to proliferative decisions, resulting in aberrant
cell-cycle entry in response to otherwise
nonproliferative pulsatile inputs. These changes in
pathway transmission shift the threshold
of temporal input patterns that can drive cell
proliferation, so that a space of inert input patterns
that are normally filtered by the pathway
can now drive proliferation.


CONCLUSION: Cancer mutations and targeted
drugs can corrupt dynamic transmission properties
in signaling pathways, shifting cellular
response thresholds and changing cell decisions
in a potentially pathological manner. Optogenetic
approaches, especially in a high-throughput
format, can be a powerful tool with which to
systematically profile how a cell transmits and
interprets information. We anticipate that further
understanding the landscape of such functional
alterations may help us mechanistically
understand, stratify, and treat diseases that
involve corrupted cellular decision-making.


The list of author affiliations is available in the full article online.
*Corresponding author. Email: wendell.lim@ucsf.edu (W.A.L.);
toettcher@princeton.edu (J.E.T.); trever.bivona@ucsf.edu (T.G.B.)
Cite this article as L. J. Bugaj et al., Science 361, eaao3048
(2018). DOI: 10.1126/science.aao3048
Read the full article at http://dx.doi.org/10.1126/science.aao3048


Optogenetic profiling of cancer cells reveals perturbed signal transmission dynamics that
can drive improper proliferation. Optogenetic stimulation of Ras allows precise profiling of
the fidelity of Ras-Erk pathway signaling in normal and cancer cells. We found
that cancer cells with certain BRAF mutations have dramatically altered signal transmission dynamics compared with normal cells. These
altered dynamics lead to a loss of temporal input resolution, so that the cancer cell may now misinterpret nonproliferative pulsatile
input patterns as a trigger to proliferate.


Optogenetic probing of 
Ras-Erk signal transmission 
Cancer mutations and targeted
drugs can disrupt dynamic signal
encoding by the Ras-Erk pathway


L. J. Bugaj, A. J. Sabnis, A. Mitchell, J. E. Garbarino, J. E. Toettcher,
T. G. Bivona, W. A. Lim


The Ras-Erk (extracellular signal-regulated kinase) pathway encodes information in its 
dynamics; the duration and frequency of Erk activity can specify distinct cell fates. To 
enable dynamic encoding, temporal information must be accurately transmitted from the 
plasma membrane to the nucleus. We used optogenetic profiling to show that both 
oncogenic B-Raf mutations and B-Raf inhibitors can cause corruption of this 
transmission, so that short pulses of input Ras activity are distorted into abnormally long 
Erk outputs. These changes can reshape downstream transcription and cell fates, 
resulting in improper decisions to proliferate. These findings illustrate how altered 
dynamic signal transmission properties, and not just constitutively increased signaling, 
can contribute to cell proliferation and perhaps cancer, and how optogenetic profiling can 
dissect mechanisms of signaling dysfunction in disease. 


Signaling through the Ras-Erk (extracellular
signal-regulated kinase) pathway controls
diverse cell decisions, including survival,
differentiation, and proliferation (1). A cell’s
fate is determined in part by the dynamics
of Ras-Erk signals, which can be encoded by different
receptors or cellular contexts (Fig. 1A) (2-6).
Thus, the cell must be able to accurately transmit 
dynamic signal patterns and then decode them 
to make proper decisions (7-15). To understand 
how the cell transmits and decodes dynamic in- 
formation, we recently developed optogenetic 
methods with which to interrogate cells with 
precisely controlled Ras inputs (Fig. 1B) (16). Our 
study revealed that the Ras-Raf-Mek-Erk protein 
kinase cascade acts as a high-fidelity transmission 
system, accurately transmitting dynamic signals 
with time scales ranging from minutes to hours 
to the nucleus [through nuclear localization of 
phosphorylated Erk (ppErk)] (Fig. 1A). In turn,
downstream transcriptional networks then de- 
code and integrate Erk dynamics to yield dis- 
tinct cellular responses (2, 16-19). 

Given the functional importance of Ras-Erk 
dynamics, we realized that changes in how the 
pathway transmits and decodes signals could 
potentially lead to cellular malfunction and dis- 
ease. Mutations within the Ras-Erk pathway 
underlie a large proportion of human tumors 
(20), and these mutations are commonly thought 
to drive cancer phenotypes through constitutive 
proliferative signaling. However, cancer pheno- 
types might also result from the corruption of 
proper dynamic signal transmission and decoding. 
Such changes could result in misinterpretation 
of dynamic environmental signals that might, for 
example, instruct cells to proliferate in response 
to normally nonproliferative inputs. Detecting 
potential defects in signal transmission and filter- 
ing requires appropriate tools that have only re- 
cently become available. We applied optogenetic 
profiling to identify alterations in Ras-Erk signal- 
ing dynamics within cancer cells, and we showed 
how these changes can result in inappropriate 
cellular decision-making. 


Optogenetic profiling of Ras-Erk signal 
transmission in lung cancer cells 


We examined Ras-Erk signaling in five patient- 
derived non-small cell lung cancer (NSCLC) cell 
lines with endogenous, validated oncogenes in 
the Ras-Erk pathway [in the epidermal growth 
factor receptor (EGFR), Ras, and B-Raf] (Cell 
lines and putative driver mutations are listed in 
table S1). As controls, we examined two normal 
human lung epithelial cell lines (Beas2B and 
16HBE) and mouse NIH 3T3 fibroblasts. To
probe how these cells processed dynamic Ras 
signals, we transduced each cell line with optoSOS, 
a genetically encoded light-activatable probe for
toggling Ras activity in living cells (Fig. 1B). The 
optoSOS system relies on the light-dependent 
dimerization of PhytochromeB (PhyB) and 
phytochrome-interacting factor 6 (PIF), which 
associate when exposed to red (650 nm) light 
and dissociate when exposed to far-red (750 nm) 
light. PhyB was tethered to the membrane, and 
PIF, fused to the Ras-activating Son of Sevenless 
Homolog 2 (SOS2) catalytic domain (SOS2cat), 
was expressed in the cytoplasm. Therefore, light 
could be used to reversibly recruit SOS2cat to the 
membrane and thus dynamically modulate Ras 
activity. We tracked signal transmission from 
Ras to Erk through live-cell microscopy by co- 
expressing a blue fluorescent protein (BFP)- 
Erk2 reporter, which accumulates in the nucleus 
upon activation (Fig. 1C) (21). For more high-throughput
and long-term analysis, we developed
the optoPlate, a device for optogenetic illumina- 
tion in microwell plates (Fig. 1C and fig. S1). This 
device allowed us to stimulate cells dynamically 
across a large parameter space and analyze mul- 
tiple cellular outputs over time through fixed-cell 
fluorescence microscopy. 


Identification of cancer cell line 
with substantially altered Ras-Erk 
signaling dynamics 


One of the five lung cancer cell lines, H1395, had 
altered dynamic signal transmission properties 
(analysis of all the cancer cell lines is provided in 
fig. S2, A and B). When H1395 cells were sub- 
jected to various pulsatile optoSOS activation pat- 
terns, Erk activity (BFP-Erk2 nuclear localization) 
responded sluggishly whenever we switched 
optoSOS on or off (Fig. 2A). In particular, Erk 
activity took longer to diminish after optoSOS 
was switched off: The deactivation half-life
(t1/2) of Erk in H1395 cells was ~20-fold longer
than that observed in normal NIH 3T3 control
cells (H1395 t1/2 = 21 min; NIH 3T3 t1/2 =
1 min). We confirmed these slow dynamics by
means of Western blot for ppErk (Fig. 2B and 
fig. S3A). Thus, in H1395 cells, instead of switch- 
ing off immediately after Ras input stops, Erk 
continues to signal. 
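
To make the difference in half-lives concrete, the following sketch computes the fraction of Erk activity remaining after Ras input stops, assuming single-exponential decay (the form fitted to the Western blot data in Fig. 2B). The function names are illustrative, and the half-lives of 1 and 21 min are the values reported above.

```python
import math


def fraction_remaining(t_min: float, half_life_min: float) -> float:
    """Fraction of signal left after t_min minutes of single-exponential decay."""
    return 0.5 ** (t_min / half_life_min)


def time_to_fraction(fraction: float, half_life_min: float) -> float:
    """Minutes needed for the signal to fall to the given fraction of its starting value."""
    return half_life_min * math.log(fraction) / math.log(0.5)


if __name__ == "__main__":
    for label, half_life in (("NIH 3T3", 1.0), ("H1395", 21.0)):
        left = fraction_remaining(5.0, half_life)
        print(f"{label}: {left:.3f} of Erk activity remains 5 min after input stops")
```

Under this assumption, about 3% of the signal remains in NIH 3T3 cells 5 min after the input is withdrawn, whereas roughly 85% remains in H1395 cells.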

The slow responsiveness and decay of Erk 
activity may degrade the H1395 cell’s ability to 
resolve distinct high-frequency dynamic patterns 
of Ras stimulation. We applied optoSOS pulse 
trains that resemble naturally observed pathway 
dynamics (2-4) to normal lung epithelial cells 
(16HBE) and H1395 cancer cells. Both cell types
resolved input pulses spaced far apart (at 40-min 
intervals) (Fig. 2C and fig. S3B). However, as the 
input pulses were spaced progressively closer, 
the H1395 cells failed to distinguish the higher- 
frequency pulses. Ultimately, the H1395 cells failed 
to perceive gaps in the signal and produced a 
constant Erk response (Fig. 2C, 5’ OFF condi- 
tion). Thus, compared with normal cells, H1395 
cells have an impaired ability to perceive dy- 
namic pathway input (Fig. 2D). 
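
The loss of temporal resolution can be illustrated with a toy simulation, offered here as a sketch rather than the model used in the study. It assumes that Erk output follows the Ras input with first-order kinetics and the same half-life for activation and deactivation. The 10-min ON duration is a hypothetical choice; only the 5-min OFF gap and the half-lives of 1 and 21 min come from the text.

```python
import math


def simulate_erk(half_life_min, on_min, off_min, n_cycles=20, dt=0.1):
    """Erk output for a square-wave Ras input under first-order kinetics."""
    k = math.log(2) / half_life_min
    decay = math.exp(-k * dt)          # exact update factor for one time step
    steps_on = round(on_min / dt)
    steps_off = round(off_min / dt)
    y = 0.0
    trace = []
    for _cycle in range(n_cycles):
        for level, steps in ((1.0, steps_on), (0.0, steps_off)):
            for _step in range(steps):
                y = level + (y - level) * decay   # relax toward the input level
                trace.append(y)
    return trace


def modulation_depth(trace, cycle_steps):
    """Peak-to-trough difference over the final input cycle."""
    last_cycle = trace[-cycle_steps:]
    return max(last_cycle) - min(last_cycle)


if __name__ == "__main__":
    cycle_steps = round(15 / 0.1)
    for label, half_life in (("normal (t1/2 = 1 min)", 1.0), ("H1395 (t1/2 = 21 min)", 21.0)):
        depth = modulation_depth(simulate_erk(half_life, 10, 5), cycle_steps)
        print(f"{label}: modulation depth {depth:.2f}")
```

With these assumptions, the fast pathway nearly returns to baseline during each 5-min gap, whereas the slow pathway shows only a small ripple around a sustained level, in line with the near-constant Erk response observed in H1395 cells.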
B-Raf P-loop mutation (G469A) slows
kinetics of signal decay 

The H1395 cell line harbors a B-Raf G469A mu- 
tation. We thought that the slow OFF-kinetics 
of Erk could be caused by either defects in 
switching off Erk or Mek (such as defects in 
downstream phosphatase function) or by defects 
in switching off the mutant B-Raf. To find nodes 
in the pathway that may cause the slow dynamics, 
we repeated our optoSOS stimulation studies, 
but when switching off optoSOS, we also added 
inhibitors to block particular steps in the pathway— 
either the Mek inhibitor U0126 (which rapidly 
shuts off signal flow from Mek to Erk) or mutant 
B-Raf inhibitor PLX-8394 (which rapidly shuts 
off signal flow from mutant B-Raf to Mek) (Fig. 3A 
and fig. S4A) (22). When either of these inhibitors 
was added concurrently with optoSOS inactivation 
in H1395 cells, active phospho-Erk (ppErk) de- 
cayed rapidly (Fig. 3A and fig. S4B), suggesting 
that normal Mek and Erk dephosphorylation 
activity was intact and that the source of extended 
signal decay lay upstream. Together, these experi- 
ments indicated that slow ppErk OFF-kinetics 
might emanate from mutant B-Raf. 

To further test whether the mutant B-Raf ac- 
counted for the slow OFF-kinetics of the pathway, 
we performed optogenetic profiling in which we 
linked the light-induced input to a different node. 
We used an optogenetic tool called optoBRaf. 
OptoBRaf is activated through inducible mem- 
brane recruitment of PIF fused to wild-type B-Raf, 
which stimulates its signaling to endogenous Mek 
(Fig. 3A, bottom) (23). OptoBRaf enabled us to 
stimulate the H1395 cells in a manner that by- 
passed the B-Raf G469A mutant. We observed 
rapid ppErk deactivation kinetics with optoBRaf 
stimulation, which was again consistent with a 
model in which the B-Raf G469A mutant is di- 
rectly responsible for the altered pathway dynam- 
ics (Fig. 3A, bottom, and fig. S4C). This experiment 
produced a rebound in ppErk signal after the 
initial rapid decay. This may be caused by relief of 
negative feedback of ppErk onto mutant B-Raf 
(24, 25) because repeating this experiment in the 
presence of PLX-8394 eliminated this rebound. 
We also observed extension of Ras-Erk kinetics 
in Beas2B normal lung epithelial cells (lacking
endogenous mutations in Ras, Raf, Mek, or Erk) 
engineered to express exogenous B-Raf G469A, 
and this extension was reversed in the presence 
of the B-Raf inhibitor PLX-8394 (fig. S4, D, E 
and F). Together, these results implicate the mu- 
tant B-Raf G469A as a kinetics-altering node in 
H1395 cancer cells. 

Lagging pathway kinetics of B-Raf G469A 
mutants were specific to mutation position be- 
cause another member of our cancer cell panel— 
HCC364—carried a B-Raf V600E mutation and 
showed wild-type, fast pathway kinetics (fig. S2B). 
G469 lies in the P-loop of B-Raf, which normally 
associates with the activation loop to maintain 
the B-Raf kinase domain in an inactive, auto- 
inhibited conformation (26). Normal activation 
of B-Raf requires release of this autoinhibition, 
which both frees the kinase domain and promotes 
the activating homo- or heterodimerization of 
Fig. 1. Probing dynamic signal transduction and filtering in cancer cells. (A) Environmental
stimuli can induce different dynamic patterns of Erk activity, which are then interpreted by
downstream transcriptional circuits to specify cell behavior. (B) OptoSOS is an optogenetic method
for Ras activation that enables probing of how cells filter and respond to dynamic Ras inputs. The
light-inducible PhyB-PIF heterodimer drives membrane recruitment of the Ras-activating SOS2
catalytic domain, which activates Ras at the membrane. Red light (650 nm) induces PhyB-PIF
dimerization, whereas far-red light (750 nm) dissociates the dimer. (C) We tested the hypothesis
that some cancer cells may inappropriately filter dynamic Ras-Erk signals. We examined how
dynamic optogenetic inputs were interpreted by normal or cancer cells through a combination of
live-cell microscopy, high-throughput optogenetic stimulation (fig. S1), and immunofluorescence.


B-Raf. Oncogenic P-loop mutations both disrupt 
the inactive conformation and enhance dimeriza- 
tion of B-Raf (26-28). Several P-loop mutations 
impair B-Raf activity yet can also be oncogenic, 
likely because they enhance C-Raf transactiva- 
tion through B-Raf-C-Raf dimerization (26). 
Enhanced B-Raf dimerization induced by the 
P-loop G469A mutation might also cause the 
delayed OFF-kinetics. If so, delayed OFF-kinetics 
should be reversed in the presence of mutations 
that disrupt B-Raf dimerization, such as the 
R509H mutation (29, 30). Indeed, Beas2B cells 
transiently transfected with BRAF G469A showed 
elongated ppErk decay kinetics, whereas cells 
transfected with the BRAF G469A/R509H double 
mutant showed wild-type kinetics (fig. S5). 
Other B-Raf P-loop mutant lines would also
be expected to show extended ppErk inactivation 
kinetics. Indeed, we searched for additional lung 
cancer cell lines driven by B-Raf P-loop muta- 
tions and found that they also showed slow ppErk 
response to input Ras pulses (fig. S6A). Specif- 
ically, the Cal12T and H1666 lung cancer cell lines, 
each expressing endogenous B-Raf G466V (a dis- 
tinct oncogenic P-loop mutation from G469A), 
showed a similar sluggish response. As in H1395 
cells, this slowed response was reversed in the 
presence of the Raf inhibitor PLX-8394. Unlike 
the activating G469A mutation, G466V decreases 
catalytic activity (26). Nevertheless, both mutants 
extended Ras-induced ppErk kinetics, which may 
reflect their shared propensity for enhanced 
dimerization with C-Raf. Although we readily
found examples of altered kinetics in BRAF- 
mutant cells, a screen of cell lines with diverse 
Ras mutants revealed no similar examples (fig. S7, 
A and B), suggesting that B-Raf is a more sensitive
point for altering ppErk signaling dynamics. 


Paradox-activating drugs that 
perturb B-Raf dimerization also 
alter Ras-Erk kinetics 


Drugs that enhance B-Raf homo- or hetero- 
dimerization would also be expected to yield 
extended Erk inactivation kinetics. We therefore 
examined ppErk kinetics in the presence of so- 
called paradox-activating B-Raf inhibitors. Al- 
though designed to inhibit mutant B-Raf activity, 
this family of drugs paradoxically activates Raf- 
Mek-Erk signaling by enhancing B-Raf dimeriza- 
tion with C-Raf (29, 31, 32). This paradoxical 
pathway activation can actually stimulate cancer 
formation in certain patients receiving these 
drugs (33-36). We found that both vemurafenib 
and SB590885—two B-Raf inhibitors in this class— 
extended the otherwise fast Erk kinetics in both 
wild-type fibroblasts (Fig. 3B) and Beas2B lung 
epithelial cells (fig. S6B). Although mechanistically
similar, vemurafenib and SB590885 are chem- 
ically distinct and had differing dose-dependent 
effects on ppErk signal kinetics (fig. S8). By con- 
trast, the B-Raf inhibitor PLX-8394, which does 
not enhance Raf dimerization, had no effect on 
Erk kinetics. As with G469A mutant-extended 
kinetics, drug-extended kinetics were reversed 
with Mek inhibition, indicating that the drugs 
extended kinetics through a mechanism up- 
stream of Mek activation (fig. S9). 

Increasing Raf dimerization with B-Raf in- 
hibitors can enhance active Ras nanocluster for- 
mation, resulting in increased gain between Ras 
and Erk but no change in the dynamics of nano- 
cluster formation (37, 38). Our results confirm 
increased gain from Ras to Erk (fig. S8) and are 
thus consistent with this mechanism. Further, 
because slow ppErk kinetics emerge despite fast 
Ras nanocluster decay dynamics (38), we con- 
clude that the sustained ppErk signal originates 
downstream of Ras activation and cluster forma- 
tion at the level of Raf activation. In total, these 
data support a model in which B-Raf P-loop 
mutations that enhance homo- or heterodimer- 
ization cause a lag in the dynamics of shutting 
off overall Raf activity (Fig. 3C). 


Modeling how slow Ras-Erk dynamics 
could alter cell decisions 


We examined how such altered Ras-Erk transmis- 
sion properties could affect downstream cellular 
decision-making. Changes in the dynamic response 
of the Ras-Erk pathway fundamentally change 
how the cell filters dynamic inputs. The wild- 
type Ras-Erk pathway filters signals shorter than 
~4 min (the pathway loses ability to transmit more 
transient changes) while faithfully transmitting 
longer ones ranging from minutes to hours (16). 
To examine the consequences of changing these 
filtering parameters, we constructed a simple 
model that integrates a low-pass filter with 
Fig. 2. B-Raf mutant H1395 cells have an impaired transmission of pulsatile Ras signals. 

(A) H1395 cells (bottom) showed extended kinetics of activation and inactivation in response to 
defined Ras input pulses. By contrast, NIH 3T3 cells (top) and the other cells in our cell line panel 
exhibited rapid kinetics (fig. S2B). Traces represent quantitation from live-cell imaging of nuclear 
BFP-Erk2 reporter accumulation. Traces were normalized between 0 and 1 and represent the 
mean ± 1 SD of 15 and 14 cells for 3T3 and H1395 cells, respectively. (B) Inactivation kinetics for 
H1395 and NIH 3T3 cells were confirmed through Western blot (blots are available in fig. S3A). 
Western blot quantification of ppErk is shown and fitted to single exponential decay. The dashed blue 
line depicts basal amount of ppErk from unstimulated cells. (C) Loss of fidelity in dynamic signal 
transduction in H1395 cells was observed through live-cell microscopy. 16-HBE (normal) and H1395 
(cancer) cells were subjected to various dynamic patterns of input signal. (Three input conditions 
are shown. All six input conditions are shown in fig. S3B). As optoSOS input frequency increased, the 
H1395 cancer cells progressively lost their response to the gaps in the signal, whereas the normal 
cells did not. Traces represent the mean of five cells. Individual traces can be seen in fig. S3B. 

(D) Changes in the cell's signal perception are analogous to cellular “blurred vision” for external 
stimuli. (Single-letter abbreviations for the amino acid residues are as follows: A, Ala; G, Gly; and 

V, Val. In the mutants, other amino acids were substituted at certain locations; for example, G469A 
indicates that glycine at position 469 is replaced by alanine.) 


downstream transcription and resultant cell fate 
commitment (fig. S10A). In this model, the cell 
senses whether the intensity of a ppErk signal 
rises above an activation threshold, above which 
ppErk-dependent transcription begins (fig. S10B). 

We compared this system with fast (normal) 
and slow system responses. In the model, slow- 
ing the system response can sustain otherwise 
transient signal activity in response to dynamic 
inputs (fig. S10B). In particular, pulse trains of 
short pulses were poorly resolved and inter- 
preted as a stronger, more continuous input. 
Over a sufficiently long integration time, cellular 
decisions for downstream outputs such as gene 
expression and proliferation could differ substantially 
between the fast and slow pathway 
models (fig. S10, C and D). 

Because oncogene- and drug-extended kinetics 
are often accompanied by increased basal signal- 
ing (figs. S4, E to F; S6B; S8; and S9), we examined 
the effects of increased basal signaling by changing 
the activation threshold in our model. Increased 
basal signaling is equivalent to a lower activation 
threshold. In the model, although increased basal 
signaling (lower threshold) minimally sensitized 
cells to proliferate under fast ppErk kinetics, slow 
ppErk kinetics dramatically increased the proliferative 
response (fig. S10D). Thus, although 
increased signaling and extended kinetics may 
synergize to control cellular response, our model 
Fig. 3. B-Raf P-loop mutations and drugs that perturb Raf dimerization 
both extend Ras-Erk pathway kinetics. (A) OptoSOS and optoBRaf coupled 
with MEK inhibition (U0126) and mutant-B-Raf inhibition (PLX-8394) were 
used to isolate B-Raf as a network node that can extend Erk kinetics. Plots 
show quantification of Western blot data (blots are available in fig. S4, A, B, and 
C). Normal wild-type decay is indicated with the gray dashed line; basal 
signaling level (no opto-stimulation) is indicated by the purple dotted line. 


predicts that kinetics can have a dominant role 
in downstream cellular behavior. 


Slow pathway dynamics alter 
transcriptional responses 


To experimentally test whether changes in signal 
transmission dynamics could alter gene expres- 
sion decisions in cells, we measured the amounts 
of several downstream output proteins in response 
to optoSOS inputs. We sought an experimental 
model in which we could isolate the effects of 
altered Ras-Erk kinetics in a well-controlled cell 
line that lacks potentially confounding mutations. 
Thus, we compared the responses of wild-type 
NIH 3T3 cells in the presence and absence of 
100 nM SB590885 (paradox inhibitor of B-Raf). 
This concentration of drug extended Ras-Erk path- 
way decay kinetics and minimally increased basal 
ppErk levels (figs. S8, A and B, and S11A). Cells 
were seeded and serum-starved in 384-well plates 
and, in the presence or absence of drug, exposed to 
various dynamic input patterns with the optoPlate 
(Fig. 4A and fig. S11B). After stimulating the cells 
over several hours, cells were fixed and immuno- 
stained for Erk-dependent transcriptional targets. 

We measured the expression of two immediate 
early gene targets, cJun and early growth response 
protein 1 (EGR1), and the cell-cycle regulator 
Cyclin D1 (Fig. 4A). These are targets that 
show strong dependence on the dynamics of 
input signals. In normal NIH 3T3s, cJun and 
Cyclin D1 expression are strongly induced with 
continuous optoSOS stimulation and are not 
induced by pulsatile, transient stimulation. By 
contrast, EGR1 shows a peak of expression with 
continuous stimulation but then shows an adapt- 
ive decrease in expression, potentially mediated 
by negative feedback (39). EGR1 expression stayed 
higher with pulsed stimulation (30 min on, 30 min 
off), probably because such pulsatile stimulation 
prevents the accumulation of maximal negative 
feedback (Fig. 4B and fig. S11, C to H). 

These dynamically responsive Ras-Erk gene 
targets showed changes in regulation with altered 
Ras-Erk signal transmission. In all cases, when 
we performed dynamic stimulation studies in the 
presence of the drug SB590885, the pulsed-input 
response shifted to more closely resemble that 
of the normal constant-input response. Upon 
SB590885 addition, both cJun and Cyclin D1 ac- 
cumulated in response to normally subthreshold 
pulsed stimulation. Conversely, pulsed stimula- 
tion in the presence of SB590885 yielded the 
adaptive response of EGR1 normally observed 
(B) Treatment of NIH 3T3s with paradoxically activating B-Raf inhibitors 
vemurafenib and SB590885 also extended ppErk decay kinetics. Datapoints 
show means ± 95% confidence interval (CI) of mean single-cell ppErk 
immunofluorescence from three replicates. (C) Our data support a model in 
which P-loop B-Raf mutations or paradoxically activating drugs can both 
enhance the Ras-induced dimerization potential of B-Raf and C-Raf, thus 
altering the kinetic properties of pathway activation and inactivation. 


with a constant Ras input (Fig. 4B and fig. S11, 
C to H). Together, these results show that altering 
Ras-Erk transmission kinetics can change how 
cells filter dynamic signals, altering expression 
patterns of the genes that control important cell 
decisions. 


Slow pathway dynamics shift thresholds 
for inducing proliferation 


Because the Ras-Erk pathway is a key driver of 
proliferation, and because we observed differen- 
tial expression of the cell-cycle regulator Cyclin 
D1, we tested whether slowed Ras-Erk trans- 
mission kinetics might corrupt proper control of 
proliferation by driving cell-cycle entry in response 
to what are normally nonproliferative dynamic 
input patterns. Dynamic Erk signals are linked 
to cell-cycle control in vivo, but the physiological 
parameters of Erk dynamics are not well defined 
(2-4, 40). We therefore tested how cells responded 
to a range of dynamic Ras inputs, both in the 
presence and absence of SB590885-induced delay 
in ppErk kinetics. We examined a set of signal 
patterns that, after 19 hours of stimulation, could 
drive cell-cycle entry, as assayed by means of DNA 
incorporation of 5-ethynyl-2′-deoxyuridine (Edu) 
during S-phase (Fig. 4C and fig. S12, A, B, and C). 
Cells in each microwell received either no signal, 
a constant signal, or a periodic signal. The time 
in the ON phase (ON interval) and OFF phase 
(OFF interval) of the periodic signal was system- 
atically varied between microwells. 
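
The periodic inputs described here can be written down as a simple ON/OFF schedule. The following Python sketch is illustrative only: the function names and the example interval values are assumptions and are not the settings programmed into the optoPlate.

```python
def periodic_schedule(on_min, off_min, total_min):
    """Return the (start, end) times, in minutes, of each ON window.

    The signal starts ON at t = 0, stays ON for on_min, switches OFF for
    off_min, and repeats until total_min is reached.
    """
    if on_min <= 0 or off_min < 0 or total_min <= 0:
        raise ValueError("on_min and total_min must be > 0, off_min >= 0")
    windows = []
    start = 0.0
    period = on_min + off_min
    while start < total_min:
        # Truncate the last ON window at the end of the experiment.
        end = min(start + on_min, total_min)
        windows.append((start, end))
        start += period
    return windows


def duty_cycle(on_min, off_min):
    """Fraction of each period spent in the ON phase."""
    return on_min / (on_min + off_min)


# Hypothetical scan: every ON interval is paired with every OFF interval,
# one pair per microwell (values chosen for illustration only).
on_intervals = [5, 10, 20]
off_intervals = [5, 30]
scan = [(on, off) for on in on_intervals for off in off_intervals]

if __name__ == "__main__":
    for on, off in scan:
        print(on, off, round(duty_cycle(on, off), 2))
```

Pairing every ON interval with every OFF interval in this way yields the grid of conditions that a heatmap of proliferation versus stimulation pattern is drawn over.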

The resulting heatmap of proliferation as a 
function of the dynamic stimulation pattern is 
shown in Fig. 4C (no-signal and constant-signal 
conditions are depicted in fig. S12D; heatmaps 
are shown in Fig. 4C and fig. S12E). In normal 
3T3s, we delineated a “growth regime” of input 
signals that drive the strongest proliferation. This 
showed that Ras-Erk-induced cell-cycle entry 
required strong and sustained signals: those 
largely consisting of long ON intervals separated 
by short OFF intervals. Slowing Ras-Erk trans- 
mission kinetics with drug dramatically expanded 
this growth regime, increasing proliferation across 
a range of otherwise nonproliferative Ras input 
patterns (Fig. 4C and figs. S12F and S13). 


Conclusion: Mutants that alter how 
a cell perceives signals can contribute 
to disease 


Some cancer mutations or targeted drugs can 
alter a cell’s dynamic signal transmission and 
filtering properties, and such changes can re- 
shape how a cell perceives or misperceives its 
environment, potentially contributing to disease 
phenotypes (Fig. 4D). Such signal misinterpreta- 
tion might contribute to hyperproliferation, but 
Ras-Erk signaling functions in many cell behaviors, 
including cell survival and migration, and thus 
defective signal transduction may plausibly affect 
these behaviors in disease as well. Functional pro- 
filing of intact signaling networks with opto- 
genetics provides a powerful method with which 
to uncover and understand such alterations in 
cellular decision-making (41). We anticipate that 
profiling more cancers and more pathways in 
this manner may uncover other types of dynamic 
signaling phenotypes that could contribute to 
disease. The improved understanding we gain 
of how mutated signaling networks differentially 
process information may help us mechanistically 
understand cancer, autoimmunity, and other dis- 
eases that involve corrupted cellular decision- 
making and may provide new dynamically 
optimized strategies for targeting these diseases 
(42-44). 


Materials and Methods 
Plasmid constructs, viral packaging, 
and transduction 


OptoSOS components Phy-mCh-CAAX (Addgene 
#50839), YFP-PIF-SOS2cat (Addgene #50851), 
and BFP-Erk2 (Addgene #50848) were described 
previously (16, 45). Phy-mCh-CAAX used the 
KRas4B-derived CAAX sequence KMSKDGKKK- 
KKKAKTKCVIM, which is expected to be farne- 
sylated. This sequence differs from the wild-type 
KRas4B CAAX only at the underlined alanine, 
which represents an S>A mutation. This muta- 
tion was made to prevent endogenous regula- 
tion at this residue, as previously reported (46). 
PAmCh-BRAF(G469A) was created by site-directed 
mutagenesis of PAmCh-BRAF, a generous gift 
Fig. 4. Perturbation of Ras-Erk signaling dynamics can alter how cells make proliferative 
decisions. (A) Transcriptional decoding of dynamic signal inputs was examined in normal NIH 3T3 
cells (fast pathway kinetics) or in cells treated with the kinetics-altering drug SB590885. Cells were 
stimulated with fixed-width signal pulses separated by various intervals. Expression of Erk targets 
and downstream cell-cycle entry were examined. (B) Altered Ras-Erk kinetics changed transcrip- 
tional output to dynamic Ras inputs. Immunofluorescence of cJun, EGR1, and Cyclin D1 expression 
time courses is depicted. Only expression in response to constant stimulus or a representative 
pulsed stimulus is shown. All input conditions tested are provided in fig. S11. Illumination was 
achieved with the optoPlate, and protein expression was assessed through single-cell immuno- 
fluorescence coupled with high-content imaging. Data points represent the median target 
fluorescence from 3000 to 4000 cells for each condition. (C) Extended Ras-Erk kinetics sensitized 
cells to proliferate under nonproliferative conditions. We used 384-well optoPlate illumination to 
examine proliferation of cells in response to a systematic scan of dynamic inputs. Normal and 
drug-treated cells were exposed to all combinations of six optoSOS pulse lengths (ON interval) and 
separated by seven pulse interval lengths (OFF interval) over 19 hours. Cells were then incubated 
with Edu for 30 min, fixed, stained, imaged, and analyzed. The percentage of cells incorporating 
Edu was plotted as an interpolated heatmap. Further analysis is available in figs. S12 and S13. 

The values used to generate the map represent means of biological quadruplicates. (D) Our data 
support the model that altering dynamic signal filtering properties can reshape the input-response 
map and may drive improper cellular behavior, such as hyperproliferation. 


from Eric Collison (UCSF), using the QuikChange 
Lightning kit (Agilent). For optoBRAF, a YFP-PIF- 
BRAF was created by PCR amplification of the YFP- 
PIF-SOS2cat vector backbone to exclude SOS2cat, 
PCR amplification of wtBRAF, and ligation using 
the In-Fusion enzyme cocktail (Clontech). Lenti- 
virus was packaged by cotransfecting the transfer 
vector, pCMVdR8.91, and pMD2.G (Addgene 
#12259) into Lenti-X 293T cells (Clontech) using 
the Fugene 6 HD transfection reagent. 48 hours 
after transfection, viral supernatant was harvested, 
sterile filtered through a 0.45 µm filter, and added 
to cells for infection. Unused supernatant was 
stored at -80°C. 72 hours after infection, transduced 
cells were sorted for expression using a FACS 
Aria Fusion (BD Biosciences). 


Cell lines, cell culture, and inhibitors 


All cell lines were maintained in standard tissue 
culture incubators at 37°C and 5% CO2. NIH 
3T3s (ATCC) were cultured in DMEM supple- 
mented with 10% calf serum (HyClone) and 1% 
penicillin/streptomycin/glutamine (ThermoFisher 
#10378016). Lenti-X 293T cells were cultured in 
DMEM High Glucose H-21 (UCSF Cell Culture 
Facility) supplemented with 10% FBS (UCSF Cell 
Culture Facility) and 1% penicillin/streptomycin 
(UCSF Cell Culture Facility). All other lines were 
cultured in RPMI (UCSF Cell Culture Facility) 
supplemented with 10% FBS and 1% penicillin/ 
streptomycin/glutamine. The Mek inhibitors U0126 
(Selleckchem #S1102) and trametinib (Selleckchem 
#S2673) and B-Raf inhibitors vemurafenib (Selleckchem 
#S1267) and SB590885 (Selleckchem 
#S2220) were obtained from Selleckchem. PLX-8394 
was obtained as a gift from Plexxikon. 


Optogenetic stimulation 


For all optogenetic experiments, cells were sup- 
plemented with HPLC-purified phycocyanobilin 
(PCB, Frontier Scientific #P14137) at a concentration 
of 5 µM (3T3s) or 10 µM (all other cells). 
Cells were incubated in PCB for ~0.5-1 hour be- 
fore optogenetic stimulation. For bulk Western 
blot experiments, cells were illuminated in a cell 
culture incubator with a custom built panel of 
either 650 nm or 750 nm LEDs for activation or 
inactivation of optoSOS, respectively. For 96- and 
384-well In-Cell Western and immunofluorescence 
assays, optogenetic experiments were performed 
with a custom-built 96-well “optoPlate” illuminator 
with adapters accommodating either 96- or 384- 
well plates (see fig. S1). Briefly, a printed circuit 
board was designed using the KiCad software 
package and manufactured through PCBUnli- 
mited (PCBUnlimited.com). The circuit board 
design allowed placement of 192 independently 
addressable LEDs, with two LEDs—one red (Vishay, 
VLMK31R1S2-GS18), one far-red (Marubeni, 
SMT780)—fitting under each well position. The 
LEDs shared a common anode, and each cathode 
was connected to one of 12 24-channel constant- 
current LED drivers (TLC5947, Texas Instru- 
ments). These drivers allow independent 12-bit 
grayscale control (0-4095) of each LED using 
pulse-width modulation. LED drivers were con- 
trolled by an on-board Arduino Micro micro- 
controller, which was programmed with custom 
script through the Arduino IDE. Custom adapt- 
ers interfacing with 96- and 384-well plates were 
designed in the Autodesk Inventor program and 
printed on a Stratasys uPrint 3D printer. 


Western blot 


Cells were seeded in 6-well plates at a density of 
1x10° cells per well. After 24h, cells were starved 
in starvation medium (DMEM or RPMI media 
supplemented with 1% penicillin/streptomycin/ 
glutamine, and 20 mM HEPES). Cells were lysed 
in ice-cold RIPA buffer supplemented with 
protease (cOmplete, Sigma #4693159001) and 
phosphatase inhibitors (PhosSTOP, Sigma 
#4906845001). After a 10 min centrifugation at 
4°C, supernatants were supplemented with 5X 
Laemmli’s sample buffer and were boiled for 
10 min. SDS-PAGE was performed in 
NuPAGE Bis-Tris gels (Invitrogen) using MES 
buffer (ThermoFisher #NP0002), and blots were 
transferred onto nitrocellulose membranes using 
the BioRad Trans-Blot Semi-Dry Transfer Cell. 
Transferred blots were then blocked with Odyssey 
blocking buffer (LI-COR #927-4000) and antibody 
stained using the Freedom Rocker liquid handling 
system. Western blots were imaged on a LI-COR 
Odyssey imager, and images were quantified 
using ImageJ. Phospho-Erk antibody was obtained 
from Cell Signaling Technologies (#4370), alpha- 
tubulin antibody was obtained from Santa Cruz 
Biotechnology (Santa Cruz, #23948, 1:1000), and 
IRDye conjugated secondary antibodies (#926-3221, 
#926-68020) were obtained from LI-COR. 


96- and 384-well 
optogenetic experiments 
Cell seeding, starvation, and illumination 


96- or 384-well plates (Greiner #655087 and 
#781092) were coated in fibronectin (Millipore, 
#FC010, 1:50 dilution in PBS) for 30 min in the 
incubator. Cells were seeded at 5000 or 1000 cells 
per well for 96- or 384-well experiments, respec- 
tively, and were spun down in the plate for 1 min 
at 100 x g immediately after seeding to obtain an 
even spatial distribution of cells. After 24h, cells 
were starved with starvation medium (basal medi- 
um with 1% penicillin/streptomycin/glutamine 
and 20 mM HEPES). To balance effective starva- 
tion while minimizing loss of cells, cells in 96- 
well experiments underwent one full medium 
replacement followed by 4X 70% replacements, 
performed manually. Cells in 384-well experi- 
ments underwent 7X 70% starvation washes 
with the Biomek FX liquid handling robot. After 
starvation for 24 hours (signaling experiments) 
or 36 hours (growth experiments), cells were sup- 
plemented with phycocyanobilin (PCB) by mixing 
a 2X PCB solution in starvation media, removing 
starvation media from the plate manually, and 
adding the appropriate amount of 2X PCB. Cells in 
PCB were then incubated in the dark for 30 min. 
Any additional drugs were added at the same 
time and in the same solution as the PCB, with 
the exception of Mek-i addition in the experi- 
ment described in fig. S9. All manipulations with 
cells in PCB were done under dim light settings, 
and cell-containing plates were covered with alumi- 
num foil whenever possible to prevent unin- 
tended photoactivation. The plates were then 
placed onto a pre-programmed optoPlate device 
and exposed to the desired illumination profiles. 


Cell fixation, immunostaining, 
and antibodies 


Upon completion of the experiment, cells were 
immediately supplemented with 16% PFA to a 
final concentration of 4% PFA. After fixing for 
10 min, the PFA-containing medium was man- 
ually aspirated with a multichannel pipette and 
cells were permeabilized with 0.5% Triton X-100 
(Sigma) for 10 min followed by ice-cold 100% 
methanol at -20°C for 10 min. Cells were then 
blocked for 30 min at room temperature with 
Odyssey Blocking Buffer (LI-COR). Primary anti- 
bodies were then diluted in fresh blocking buffer: 
anti-ppErk (CST #4370, 1:200), anti-ppErk (CST 
#4344, 1:50), anti-cJun (CST #9165, 1:100), 
anti-EGR1 (CST #4154, 1:800), anti-CyclinD1 (Abcam 
#ab134175, 1:100). Blocking buffer was removed 
and cells were incubated in primary antibody 
solutions overnight. Cells were washed 5X with 
PBS with 0.1% Tween-20 (Sigma). All washes 
were performed with a BioTek EL406 liquid 
handler. Cells were then incubated in secondary 
antibody solutions. For In-Cell Western experi- 
ments, IRDye 800CW-conjugated goat anti-Rabbit 
(Licor, #926-32211, 1:800) secondary was used, 
and CellTag700 (Licor, #926-41090, 1:2000) was 
used for normalization. For single cell immuno- 
fluorescence, Alexa-488 and Alexa-647 conjugated 
goat anti-Rabbit secondary antibodies (Jackson 
Immunoresearch, #111-545-003 and #111-605-003, 
1:100) were used in conjunction with DAPI (Mo- 
lecular Probes, #D1306, 300 nM) for nuclear label- 
ing. After 1 hour of secondary antibody incubation, 
cells were washed 5X in PBS + 0.1% Tween-20. 


Imaging 
In-Cell Western: For In-Cell Western experiments, 
plates were imaged on the LI-COR Odyssey scan- 
ner. Intensity measurements for each well were 
exported using the integrated In-Cell Western anal- 
ysis software and were further analyzed in R. 
High content imaging: Single cell immunofluorescence 
and Edu labeling (below) were measured 
on the ThermoFisher Scientific ArrayScan 
XTI High Content Platform imager, and image 
quantitation was conducted through the inte- 
grated HCS Studio software. Briefly, cells were 
identified through segmentation of DAPI-stained 
nuclei, and parameters were specified to ensure 
proper segmentation of single cells. Mean nu- 
clear intensities were then calculated for each 
cell for the fluorescence channels reporting on all 
targets except for ppErk. For ppErk, a 2-5 pixel- 
wide ring was drawn around the nucleus, and 
mean fluorescence intensity in this ring was rec- 
orded. The fluorescence measurements were then 
exported, and further analysis was conducted in R. 


Edu proliferation assay 


S-phase entry was assessed through Edu incor- 
poration using the Click-iT Edu AlexaFluor 555 
Imaging kit (ThermoFisher, #C10338). Cells were 
seeded and starved as described above in the 
“96- and 384-well optogenetics experiments” 
section. After completion of the desired illumination 
time, cells were supplemented with 
Edu (5 µM) for 30 additional 
minutes of illumination. Cells were then fixed 
and permeabilized as described, and Edu was 
conjugated to AlexaFluor 555 as per manufac- 
turer instructions. 


Live cell microscopy 


Cells were seeded in 384-well glass-bottom plates 
(Matrical, Inc. #MGB101-1-2-LG) that were pre-coated 
with 50 µL of 20 µg/mL fibronectin (Millipore 
#FC010) for 1 hour. Upon seeding, cells were spun 
down at 100 x g for 1 min to promote an even 
distribution of cells on the well bottom. The fol- 
lowing day, cells were starved with starvation 
medium (defined above). 1 hour before imaging, 
starvation medium was replaced with fresh starvation 
medium containing 5 µM PCB. 

Confocal imaging was conducted on a Nikon 
Eclipse TI inverted microscope with a Yokogawa 
CSU-X1 spinning disk confocal unit, a 20x PlanApo 
TIRF 1.49 NA objective, and an EM-CCD camera 
(Andor). Environmental control was maintained 
with a humidified environmental chamber at 37°C 
and 5% CO2 (In Vivo Scientific). BFP, YFP, and 
mCherry were imaged with 405 nm, 488 nm, and 
561 nm lasers (LMM5, Spectral Applied Research), 
respectively. 

Cells were exposed to 650 nm and 750 nm 
light for optogenetic control as described previ- 
ously (16). Briefly, a 650 nm LED was mounted 
into the epifluorescence illumination port, and 
its light intensity (voltage control) was regulated 
with custom Matlab scripts (45) controlling the 
analog output of a DT9812 board (Data Transla- 
tion), which was connected to the LED. 750 nm 
light was applied by filtering bright-field light 
through a 750 nm longpass filter (FSQ-RG9, 
Newport) and controlling its timing through 
software control of the diascopic shutter. 


Image analysis 
BFP-Erk2 responses to variable 
Ras input pulses 


For visualizing BFP-Erk2 responses to dynamic 
Ras activation, live cell imaging was analyzed 
with a combination of ImageJ and custom R 
scripts. Nuclear accumulation of BFP-Erk2 was 
measured by mean fluorescence intensity of an 
ROI within the cell nucleus in the BFP channel, 
and YFP-PIF-SOS2cat membrane translocation 
was measured by cytoplasmic depletion of YFP, 
as described previously (16). BFP and YFP traces 
for individual cells were then corrected for 
fluorescence drift. Traces underwent a linear 
transform by calculating a linear regression 
of all points, and then subtracting this fit from 
the original trace. Photobleaching was corrected 
by fitting an exponential decay envelope to each 
trace and dividing each trace by its envelope 
function. Traces were then normalized between 
0 and 1, and the YFP-SOS2cat trace was inverted 
for visual clarity. 
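
The trace-processing steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' R scripts: the function names are hypothetical, and the exponential envelope is estimated here by a straight-line fit to the logarithm of the trace, which requires strictly positive values.

```python
import math


def linear_fit(t, y):
    """Ordinary least-squares line y = a + b*t; returns (a, b)."""
    n = len(t)
    mt, my = sum(t) / n, sum(y) / n
    sxx = sum((ti - mt) ** 2 for ti in t)
    sxy = sum((ti - mt) * (yi - my) for ti, yi in zip(t, y))
    b = sxy / sxx
    return my - b * mt, b


def remove_drift(t, y):
    """Subtract the linear regression through all points from the trace."""
    a, b = linear_fit(t, y)
    return [yi - (a + b * ti) for ti, yi in zip(t, y)]


def correct_photobleaching(t, y):
    """Divide the trace by a fitted exponential envelope A*exp(k*t).

    Assumption: the envelope comes from a line fit to log(y), so every
    value in y must be positive.
    """
    log_a, k = linear_fit(t, [math.log(yi) for yi in y])
    return [yi / math.exp(log_a + k * ti) for ti, yi in zip(t, y)]


def normalize01(y):
    """Rescale a trace so that its minimum is 0 and its maximum is 1."""
    lo, hi = min(y), max(y)
    return [(yi - lo) / (hi - lo) for yi in y]


def invert(y):
    """Flip a 0-1 normalized trace, as done for the YFP trace."""
    return [1.0 - yi for yi in y]
```

Each function returns a new list, so the steps can be chained in whichever order suits a given trace.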


Edu and immunofluorescence analysis 


For quantifying the fraction of Edu+ cells in 
a well, density plots of Edu intensities of 
cells in each well were constructed. The Edu 
distribution was bimodal, with a tight peak 
for Edu- cells and a broader peak for Edu+ 
cells. A custom segmentation algorithm classified 
the Edu- and Edu+ populations for each trace. 
Example peaks and segmentation are shown 
in Fig. S12C. 

For single cell immunofluorescence, fluores- 
cence distributions were generated for each well 
and the distribution median was extracted. 
Model and fitting 
Low pass filter of signal processing 
To model how cells filter and respond to signals, 
we constructed a 3-step model. The first step 
describes signal filtering, the second describes 
signal perception, and the third describes cell 
fate decisions resulting from that perception. 
To model low-pass signal filtering, we imple- 
mented a conceptually simple filter, a first-order 
RC circuit. This model consists of a voltage source 
V, resistor R, and capacitor C wired in series. In 
our example, these can be thought of as the input 
signal, signal transduction through the pathway, 
and the ability for the cell to hold that signal (e.g., 
the total abundance of a protein that can be 
activated), respectively. We were interested in 
measuring the voltage (signal) across the capacitor 
($V_C = q/C$) as a function of dynamic inputs. 
Kirchhoff's voltage law states that the sum of the 
voltages across each component in this loop 
equals 0: 

$$V - V_R - V_C = 0$$

$$V(t) - R\,\frac{dq(t)}{dt} - \frac{q(t)}{C} = 0$$


where C = 1, R was variably defined, and V was 
the dynamic model input. Dynamic inputs were 
achieved by changing the value of V between 0 
(OFF) and 1 (ON) at defined intervals. 

The time constant τ = RC defines the signal 
kinetics in this model, that is, the speed 
of exponential rise and decay of the signal. To 
model signal processing changes in cancer, we 
changed the value of R, thus changing the value 
of τ. We started the simulation from a state of 
rest, where q(t = 0) = 0. We implemented the 
model in the R programming language 
(https://cran.r-project.org) using the deSolve package. 
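
For a piecewise-constant input, the filter equation has a closed-form solution that makes the role of the time constant explicit. The worked form below follows from the loop equation with the capacitor voltage written as $V_C = q/C$. It is the standard result for a first-order RC circuit and is included as a reading aid, not as part of the original methods; $t_0$ denotes the start of an interval over which $V$ is constant.

$$\tau\,\frac{dV_C}{dt} = V - V_C, \qquad \tau = RC$$

$$V_C(t) = V + \bigl(V_C(t_0) - V\bigr)\,e^{-(t - t_0)/\tau}, \qquad t \ge t_0$$

During an ON interval starting from rest ($V = 1$, $V_C(t_0) = 0$) this gives $V_C(t) = 1 - e^{-(t - t_0)/\tau}$, and during an OFF interval ($V = 0$) it gives $V_C(t) = V_C(t_0)\,e^{-(t - t_0)/\tau}$. A larger R therefore slows both the rise and the decay by the same factor.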

To model cellular perception of a filtered sig- 
nal, we defined a signal intensity at which down- 
stream transcriptional circuits are turned on. 
In fig. S10, B and C, we defined this threshold 
to be 30% based on estimates from previous 
studies (2, 47), but fig. S10D shows that a broad 
range of threshold values gives qualitatively sim- 
ilar results. 

Finally, we assumed that a cell’s decision to 
proliferate was directly correlated to the cumu- 
lative signal the cell perceived by the end of the 
simulation. Many possible relationships exist 
between perceived signal and proliferation, but 
we chose a direct correlation due to its simplicity. 
We expect that a different relationship would 
also show that altered signal perception could 
lead to differential cell fate choices, though 
the set of inputs subject to misperception may 
change. 
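
The three steps of the model can be sketched in Python as follows. This is a hedged illustration and not the authors' deSolve implementation: the function names, the time step, and the example values of the time constant are assumptions, and "cumulative perceived signal" is interpreted here as the total time the filtered signal spends above the threshold.

```python
import math


def filtered_signal(on_min, off_min, total_min, tau, dt=0.1):
    """Step 1: low-pass filter a periodic 0/1 input (first-order RC, C = 1).

    Uses the exact update for a piecewise-constant input,
    v(t + dt) = u + (v(t) - u) * exp(-dt / tau), starting from rest.
    """
    on_steps = int(round(on_min / dt))
    period_steps = int(round((on_min + off_min) / dt))
    decay = math.exp(-dt / tau)
    v, trace = 0.0, []
    for i in range(int(round(total_min / dt))):
        u = 1.0 if (i % period_steps) < on_steps else 0.0
        v = u + (v - u) * decay
        trace.append(v)
    return trace


def perceived_signal(trace, threshold=0.3):
    """Step 2: the signal is perceived only while above the threshold."""
    return [1.0 if v > threshold else 0.0 for v in trace]


def proliferation_score(trace, threshold=0.3, dt=0.1):
    """Step 3: cumulative perceived signal (minutes above threshold).

    Assumption: the proliferative output scales directly with this sum.
    """
    return sum(perceived_signal(trace, threshold)) * dt


if __name__ == "__main__":
    # Hypothetical comparison: 10 min ON / 10 min OFF for 600 min.
    fast = proliferation_score(filtered_signal(10, 10, 600, tau=1.0))
    slow = proliferation_score(filtered_signal(10, 10, 600, tau=20.0))
    print("fast:", fast, "slow:", slow)
```

With these illustrative values the slow pathway no longer drops below the threshold during the OFF intervals once it has charged, so it accumulates more perceived signal than the fast pathway for the same pulsed input. Other interval lengths can reverse the ordering, which is why a scan over many input patterns is informative.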


Curve fitting 


Single exponential decay was fitted to normal- 
ized kinetics data using the glm function in R 
assuming a Gaussian error distribution and a 
“log” link. The curves describing proliferation 
as a function of duty cycle in Fig. S12F were fit to 
a Hill function. 
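
A Gaussian model with a log link fits the mean as exp(b0 + b1*t) while measuring error on the original scale, which is what makes it suitable for single exponential decay. The Python sketch below reproduces that idea with iteratively reweighted least squares. It is an illustration under assumptions, not the authors' R code: the function names are hypothetical, and the starting values come from a straight-line fit to log(y), which requires positive data.

```python
import math


def _weighted_line(t, z, w):
    """Weighted least-squares line z = b0 + b1*t; returns (b0, b1)."""
    sw = sum(w)
    mt = sum(wi * ti for wi, ti in zip(w, t)) / sw
    mz = sum(wi * zi for wi, zi in zip(w, z)) / sw
    sxx = sum(wi * (ti - mt) ** 2 for wi, ti in zip(w, t))
    sxz = sum(wi * (ti - mt) * (zi - mz) for wi, ti, zi in zip(w, t, z))
    b1 = sxz / sxx
    return mz - b1 * mt, b1


def fit_exponential_decay(t, y, n_iter=50, tol=1e-12):
    """Fit y ~ amplitude * exp(-rate * t) with Gaussian errors, log link.

    Returns (amplitude, rate).
    """
    # Starting values from a line fit to log(y).
    b0, b1 = _weighted_line(t, [math.log(v) for v in y], [1.0] * len(t))
    for _ in range(n_iter):
        mu = [math.exp(b0 + b1 * ti) for ti in t]
        # Working response and weights for a Gaussian family, log link.
        z = [b0 + b1 * ti + (yi - mi) / mi for ti, yi, mi in zip(t, y, mu)]
        w = [mi * mi for mi in mu]
        new_b0, new_b1 = _weighted_line(t, z, w)
        converged = abs(new_b0 - b0) + abs(new_b1 - b1) < tol
        b0, b1 = new_b0, new_b1
        if converged:
            break
    return math.exp(b0), -b1


def half_life(rate):
    """Time for the fitted signal to fall to half its value."""
    return math.log(2) / rate
```

Reporting the fit as a half-life gives a single number, in the units of t, for comparing fast and slow decay kinetics.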
Developmental barcoding of whole 
mouse via homing CRISPR 


Reza Kalhor*, Kian Kalhor, Leo Mejia, Kathleen Leeper, Amanda Graveline, 
Prashant Mali, George M. Church* 


INTRODUCTION: The remarkable develop- 
ment of a single cell, the zygote, into the full 
organism occurs through a complex series of 
division and differentiation events that resem- 
ble a tree, with the zygote at the base branch- 
ing through lineages that end in the terminal 
cell types at the top. Characterizing this tree of 
development has long been a subject of inter- 
est, and the combination of modern genome 
engineering and sequencing technologies prom- 
ises a powerful strategy in its service: in vivo 
barcoding. For in vivo barcoding, 
heritable random mutations are induced 
to accumulate during development 
and sequenced post hoc to 
reconstruct the lineage tree. Demonstrations 
thus far have largely focused 
on lower vertebrates and have used a 
barcoding element with a constrained 
window of activity for clonal tracing 
of individual cells or cell types. Implementation 
in mammalian model 
systems, such as the mouse, incurs 
unique challenges that require major 
enhancements. 


RATIONALE: To address the com- 
plexity of mammalian development, 
we reasoned that multiple indepen- 
dent in vivo barcoding elements could 
be deployed in parallel to exponen- 
tially expand their recording power. 
Independence requires both an ab- 
sence of cross-talk between the ele- 
ments and an absence of interference 
between their mutation outcomes. A 
system with the potential to deliver 
on these requirements is homing 
CRISPR, a modified version of ca- 
nonical CRISPR wherein the homing 
guide RNA (hgRNA) combines with 
CRISPR-Cas9 nuclease for repeated 
targeting of its own locus, leading to 
diverse mutational outcomes. There- 
fore, in mouse embryonic stem cells, 
we scattered multiple hgRNA loci 
with distinct spacers in the genome 
to serve as barcoding elements. With 
this arrangement, each hgRNA acts 
independently as a result of its unique spacer 
sequence, and undesirable deletion events be- 
tween multiple adjacent cut sites are less like- 
ly. Using these cells, we generated a chimeric 
mouse with 60 hgRNAs as the founder of the 
MARC1 (Mouse for Actively Recording Cells 1) 
line that enables barcoding and recording of 
cell lineages. 


RESULTS: In the absence of Cas9, hgRNAs 
are stable and dormant; to initiate barcoding, 
we crossed MARCI mice with Cas9 knock-in 
mice. In the resulting offspring, hgRNAs were 
activated, creating diverse mutations such that 
an estimated 10° distinct barcode combina- 
tions can be generated with only 10 hgRNAs. 
Furthermore, hgRNAs showed a range of ac- 
tivity profiles, with some mutating soon after 
conception while others exhibited lower activity 

through most of the ges- 
tation period. This range 
Read the full article resulted in sustained bar- 
at http://dx.doi. coding throughout ges- 
org/10.1126/ tation and recording of 
science.aat9804 developmental lineages: 
Gorseiinn teenies as Tach call seherte a ek 
of unique mutations that are passed on to 
its daughter cells, where further unique mu- 
tations can be added. Consequently, at any 
stage in such developmentally barcoded mice, 
closely related cells have a more similar mu- 
tation profile, or barcode, than the more dis- 
tant ones. These recordings remain embedded 
in the genomes of the cells and can be ex- 
tracted by sequencing. 
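As a rough check on how quickly barcode combinations grow when several hgRNAs mutate independently, the sketch below multiplies the per-hgRNA allele counts. The figure of 200 alleles per hgRNA is an assumption chosen for illustration (the article reports more than 200 distinct mutant spacers for each fast hgRNA), and the function name is hypothetical.

```python
def barcode_combinations(alleles_per_hgrna: int, n_hgrnas: int) -> int:
    """Count combined barcodes, assuming every hgRNA mutates independently
    and contributes the same number of distinguishable alleles."""
    return alleles_per_hgrna ** n_hgrnas


# Assumed illustrative values: 200 alleles per hgRNA, 10 hgRNAs.
n = barcode_combinations(200, 10)
print(f"{n:.2e}")  # 1.02e+23
```

Under these assumptions the count is on the order of 10^23. Real alleles are not equally likely, so the effective diversity would be lower than this upper bound.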

We used these recordings to carry out bottom- 
up reconstruction of the mouse lineage tree, 
starting with the first branches that 
emerged after the zygote, and ob- 
served robust reconstruction of the 
correct tree. We also investigated axis 
development in the brain by sequenc- 
ing barcodes from the left and right 
side of the forebrain, midbrain, and 
hindbrain regions. We found that bar- 
codes from the left and right sides of 
the same region were more closely 
related than those from different re- 
gions; this result suggests that in the 
precursor of the brain, commitment 
to the anterior-posterior axis is estab- 
lished prior to the lateral axis. 


CONCLUSION: This system provides
an enabling and versatile platform
for in vivo barcoding and lineage
tracing in a mammalian model system.
It can straightforwardly create
developmentally barcoded mice in
which lineage information is prerecorded
in cell genomes. Combining
multiple independently acting molecular
recording devices greatly enhances
their capacity and allows for reliable
information recovery and reconstruction
of deep lineage trees.


Developmental barcoding and lineage reconstruction in 
mice. Crossing the MARC1 mouse line, which carries multiple 
hgRNAs, with a CRISPR-Cas9 mouse line results in develop- 
mentally barcoded offspring that record lineages in their cells. 
These recordings were extracted and used to reconstruct 
lineage trees. A combination of the trees extracted from 
different developmentally barcoded mice is shown. ICM, inner 
cell mass; E0, embryonic day 0.

Developmental barcoding of whole
mouse via homing CRISPR


Reza Kalhor, Kian Kalhor, Leo Mejia, Kathleen Leeper, Amanda Graveline,
Prashant Mali, George M. Church


In vivo barcoding using nuclease-induced mutations is a powerful approach for recording 
biological information, including developmental lineages; however, its application in 
mammalian systems has been limited. We present in vivo barcoding in the mouse with 
multiple homing guide RNAs that each generate hundreds of mutant alleles and combine 
to produce an exponential diversity of barcodes. Activation upon conception and continued 
mutagenesis through gestation resulted in developmentally barcoded mice wherein 
information is recorded in lineage-specific mutations. We used these recordings for reliable 
post hoc reconstruction of the earliest lineages and investigation of axis development in 
the brain. Our results provide an enabling and versatile platform for in vivo barcoding and 
lineage tracing in a mammalian model system. 


In sexually reproducing multicellular eukaryotes,
a single totipotent zygote remarkably
develops into all cells of the full organism. 
This development occurs through a highly 
orchestrated series of differentiation events 
that take the zygote through many lineages as 
it divides to create all the different cell types 
(1). This path resembles a tree, with the zygote 
at the base of the trunk branching into stems 
of cell lineages that eventually end in the termi- 
nal cell types at the top of the tree (2, 3). The 
ability to map this tree of development will have 
a far-ranging impact on our understanding of 
disease-causing developmental aberrations, our 
capacity to restore normal function in damaged 
or diseased tissues, and our capability to generate 
substitute tissues and organs from stem cells. 
Tracing the lineage tree in non-eutelic higher 
eukaryotes with complex developmental path- 
ways remains challenging. Clonal analysis, which 
entails cellular labeling and tracking with a dis- 
tinguishable heritable marker, has been effective 
when evaluating a limited number of cells or 
lineages (4-7). Using more diverse presynthe- 
sized DNA sequences as markers, known as 
cellular barcoding, has allowed for analysis of 
larger cell numbers (7-9). What limits these 
approaches is the static nature of labeling that 
only allows analysis of a snapshot in time. Re- 
cent advances in genome engineering technol- 
ogies, however, have enabled in vivo barcode 
generation (10, 11). In this approach, a locus is
targeted for rearrangement or mutagenesis such
that a diverse set of outcomes is generated in
different cells (12). As these barcodes can be gen- 
erated over a sustained period of time, they 
drastically expand the scope of cellular barcod- 
ing strategies, promising deep and precise lineage 
tracing, from the single-cell to the whole-organism 
level (Fig. 1A) (13-15), and recording of cellular 
signals over time (16, 17). Multiple studies establish
proof of this principle in recording and
lineage tracing, with demonstrations in cultured
cells (14-16, 18) and in lower vertebrates (13, 19-22).
However, no demonstrations have yet been car- 
ried out in mice, a model organism more rele- 
vant to human health in many aspects such as 
development. The challenges associated with 
work in mice can account for this discrep- 
ancy. Gestation in mice takes place inside the 
mother’s womb, rendering genetic manipula- 
tion of individual zygotes or conceptuses dif- 
ficult. Additionally, the longer gestation time 
in mice, together with the multitude of lin- 
eages that segregate throughout its develop- 
ment, demands sustained generation of highly 
diverse barcodes with minimal unwanted over- 
writing events to maximize the chance for suc- 
cessfully recording the events of interest. 
Here, we deployed multiple independent bar- 
coding loci in parallel for robust in vivo barcod- 
ing and lineage recording in mice. We created 
a mouse line that carries a scattered array of 
60 genomically integrated homing CRISPR 
guide RNA (hgRNA) loci. hgRNAs are modified 
versions of canonical single guide RNAs (sgRNAs) 
(23) that target their own loci (Fig. 1B) to create 
a substantially larger diversity of mutants than 
canonical sgRNAs (Fig. 1C) and thus act as ex- 
pressed genetic barcodes (4). Crossing this hgRNA 
line with a Cas9 line resulted in developmen- 
tally barcoded offspring because hgRNAs sto- 
chastically accumulate mutations throughout 
gestation, generating unique mutations in each
lineage without deleting earlier mutations, in
such a way that closely related cells have a more 
similar mutation profile, or barcode, than more 
distant ones. In developmentally barcoded mice, 
we extensively characterized the activity profile 
and mutant alleles of each hgRNA and carried 
out post hoc bottom-up reconstruction of the 
lineage tree in the early stage of development, 
starting with the first branches at its root and 
continuing through some of the germ layers. 
We also investigated lineage commitment with 
respect to the anterior-posterior and lateral axes 
in the brain. 


Founder mouse with multiple hgRNA loci 


We created a library of hgRNAs with four dif- 
ferent transcript lengths, variable spacer se- 
quences, and 10-base identifiers downstream 
of the hgRNA scaffold in a transposon back- 
bone (Fig. 1D) (24). This library was transposed 
into mouse embryonic stem (mES) cells under 
conditions that would result in a high number 
of integrations per cell (Fig. 1E) (24). Trans- 
fected mES cells were injected into blastocysts, 
which were then implanted in surrogate females 
to generate chimeric mice. Of the 23 chimeric 
mice that resulted, eight males were more than 
60% transgenic as assessed by their coat color 
(Fig. 1E). Five of the eight showed more than 
20 total hgRNA integrations in their somatic 
genomes and were crossed with wild-type mice 
to determine the number of hgRNAs in their 
germlines. The chimera with the highest average
number of germline hgRNAs transmitted to its
progeny was selected for further studies and
for starting a line. We refer to this mouse as the
MARC1 (Mouse for Actively Recording Cells 1)
founder and its progeny as the MARC1 line. All
results described below focus on the MARC1
founder and its progeny.


Sequence, genomic position, and 
inheritance of hgRNA loci 


By sequencing the hgRNA loci in the MARC1 
founder, we identified 60 different hgRNAs 
(Table 1 and table S1). Each hgRNA has a unique 
10-base identifier and a different spacer sequence 
(table S1). We also sequenced the regions im- 
mediately flanking the transposed hgRNA ele- 
ments (24), which allowed us to determine the 
genomic positions of 54 of the 60 hgRNAs 
(Fig. 1F, Table 1, and table S1), of which 26 are 
intergenic and 28 are located in an intron of a 
known gene (table S3) (24); none are located in 
an exon or are expected to disrupt the gene. We 
then crossed the MARC1 founder with multiple 
females and analyzed germline transmission and 
the inheritance pattern of these hgRNAs in the 
more than 100 resulting offspring. All 60 hgRNAs 
were transmitted through the germline, and the 
offspring carrying them were fertile, had normal 
litter sizes, and presented no morphological ab- 
normalities. Of these 60 hgRNAs, 55 showed a 
Mendelian inheritance pattern, appearing in 
about 50% of the offspring (Table 1 and table S1). 
An additional 3 of the 60, all L30 hgRNAs, were 
detected in fewer than 20% of the offspring, 




Fig. 1. In vivo barcoding with hgRNAs and strategy to generate a 
mouse with multiple hgRNA integrations. (A) Recording lineages using 
synthetically induced mutations in the genome. A number of loci (n) 
gradually accumulate heritable mutations as cells divide, thereby recording 
the lineage relationship of the cells in an array of mutational barcodes. 
Dashed ovals, cells; gray lines, an array of n mutating loci; colored 
rectangles, mutations. (B) Homing CRISPR system in which the 
Cas9-hgRNA complex cuts the locus encoding the hgRNA itself. As
the NHEJ repair system repairs the cut (63), it introduces mutations in the
hgRNA locus. (C) Example of mutations that are created in the hgRNA
locus that can effectively act as barcodes. (D) Design of PiggyBac hgRNA
library for creating a transgenic mouse. Four hgRNA sublibraries with 21,
25, 30, and 35 bases of distance between transcription start site (TSS)
and scaffold PAM were constructed and combined. The spacer sequence
(light orange box) and the identifier sequence (green box) were composed
of degenerate bases. (E) Blastocyst injection strategy for producing
hgRNA mice. The hgRNA library was transposed into mES cells. Cells
with a high number of transpositions were enriched using puromycin
selection and injected into E3.5 mouse blastocysts to obtain chimeras.
Chimera 7 was chosen as the MARC1 founder. (F) Chromosomal position
of all 54 hgRNAs whose genomic position was deciphered in the MARC1
founder (red bars). Bars on the left or right copy of the chromosome
indicate the hgRNAs that are linked on the same homologous copy.
hgRNAs whose exact genomic position is not known but whose chromosome
can be determined on the basis of linkage are shown below
the chromosome. ITR, PiggyBac inverted terminal repeats; insl, insulator;
U6, U6 promoter; ter, U6 terminator; ID, identifier sequence; EF1, human 
elongation factor-1 promoter; puro, puromycin resistance. 


which we attribute to the low detection rate of 
L30 hgRNAs due to the performance of the 
polymerase chain reaction primer used for these 
and only these three hgRNAs (24). The remain- 
ing two hgRNAs were transmitted to almost 75% 
of the offspring, a result best explained by the
duplication of these hgRNAs to loci more than
50 cM away on the same chromosome or to loci 
on different chromosomes and confirmed by 
the genomic location data (table S1) (24). 
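The transmission frequencies above follow from simple Mendelian arithmetic, sketched here under stated assumptions: the founder is heterozygous at each hgRNA locus, copies of a duplicated hgRNA segregate independently (unlinked, or more than 50 cM apart), and detection is perfect. The function names are hypothetical.

```python
def transmission_probability(n_unlinked_copies: int) -> float:
    """Probability that an offspring inherits at least one copy of an hgRNA
    present at n independently segregating heterozygous loci."""
    return 1.0 - 0.5 ** n_unlinked_copies


def expected_coinheritance(p_first: float, p_second: float) -> float:
    """Expected co-inheritance frequency of two independently segregating hgRNAs."""
    return p_first * p_second


print(transmission_probability(1))        # 0.5, single-copy Mendelian locus
print(transmission_probability(2))        # 0.75, hgRNA duplicated to an unlinked locus
print(expected_coinheritance(0.5, 0.5))   # 0.25, two independent single-copy loci
```

Observed co-inheritance frequencies that depart from the product of the individual frequencies are the kind of deviation that would point to linkage.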

We also compared the co-inheritance frequencies
of MARC1 hgRNAs to those expected from
Mendelian inheritance of independently segregating
loci (fig. S1A). We found no mutually exclusive
cosegregating groups of hgRNAs (fig. S1A),
indicating that the entire germline in the MARC1 
founder was derived from only one of the injected 
stem cells and is thus genetically homogeneous.

Fig. 2. Activity of MARC1 hgRNAs. (A) Activity profiles of all 60 hgRNAs
in embryonic and adult progenies of the MARC1 founder crossed with
Cas9 knock-in females, broken down by hgRNA length. The fraction of
mutant (nonparental) spacer sequences in each hgRNA is measured.
Lines connect the observed average mutation rates of one hgRNA.
Means ± SEM are shown (N is different for each value; see Table 2). See
table S2 for numerical values of the plot. (B) Average activity profiles
of each hgRNA class in embryonic and adult progenies of the MARC1
founder crossed with Cas9 knock-in females. Means ± SEM are shown
as representations of range of activity (N is different for each value;
see Table 2). (C) Functional categorization of hgRNAs based on their
activity profile in (A), broken down by length. (D) Position and transcription
direction of hgRNAs with respect to all known coding and noncoding
genes, annotated for their functional category. See table S3 for the
genes in which hgRNAs are located; see fig. S3 for breakdown of this
plot by hgRNA length.


Considering that every hgRNA detected in the 
somatic tissue of the MARC1 founder was also 
transmitted to its offspring, these results further 
suggest that almost all transgenic cells within 
this chimera were derived from one of the stem 
cells that were injected into its blastocyst, an 
observation consistent with previous studies 
(25, 26). The co-inheritance analysis also re- 
vealed the groups of hgRNAs that deviate from 
an independent segregation pattern, suggesting 
that they are linked on a chromosome (fig. S1B). 
Close examination of this linkage disequilibrium 
allowed us to determine which linked hgRNAs 
were on different homologous copies of the 
same chromosome or were linked on the same 
copy of a chromosome (Fig. 1F and fig. S1C).
Combined with the genomic location infor- 
mation that was obtained by sequencing, this 
co-inheritance analysis allowed us to decipher 
the cytogenetic location of most hgRNAs in the
MARC1 founder with a high degree of confidence
(Fig. 1F).


Activity of hgRNAs 


We next studied the activity of MARC1’s hgRNAs 
upon activation with Cas9. For that, we crossed the 
MARC1 founder with Rosa26-Cas9 knock-in females,
which constitutively express the Streptococcus 
pyogenes Cas9 protein (27). Considering that ma- 
jor zygotic genome activation in the mouse occurs 
at the two-cell stage (28), hgRNA activation is 
expected soon after conception. We sampled 
these Cas9-activated offspring at various stages
after conception to measure the fraction of mutated
spacers for each hgRNA. In all, we gathered
190 samples from 102 animals in seven
embryonic stages and the adult stage (Table 2). 
The results confirm that hgRNAs start mutating 
their loci soon after the introduction of Cas9 
(Fig. 2A). However, the rate at which these mu- 
tations accumulate varied widely among the 
60 MARC1 hgRNAs (Fig. 2A). On the basis of
these activity levels, we classified hgRNAs into 
four categories with distinct activation profiles 
(Fig. 2B): (i) five “fast” hgRNAs that mutate in 
at least 80% of the cells in each sample by 
embryonic day 3.5 (E3.5) and in almost all cells by 
E8.5; (ii) 27 “slow” hgRNAs that mutate in only a 
minority of cells even in the adult stage; (iii) nine 
“mid” hgRNAs, intermediate between fast and 
slow, that accumulate mutations throughout 
embryonic development and are mutated in 
almost all cells only in later embryonic or adult 
stages; and (iv) 19 hgRNAs that appear to be in- 
active, at least with this level of Cas9 expression, 
mutating in fewer than 2% of sampled cells even 
in the adult stage (table S2). Most mutations that
were detected (about 80% for fast hgRNAs) are 
expected to render the hgRNA nonfunctional 
and thus prevent further changes (fig. S2). 
Transcript length clearly affects hgRNA activ- 
ity: A far higher fraction of L21 hgRNAs, which 
have the shortest possible transcript length, were 
active by comparison to L25, L30, and L35 hgRNAs, 
which are longer by 4, 9, and 14 bases, respectively
(Fig. 2, A and C). Furthermore, all fast hgRNAs
were L21 hgRNAs, whereas in longer hgRNAs the
inactive proportion appeared to grow with in- 
creasing length (Fig. 2C). Beyond transcript 
length, we found that the variation in activity 
among hgRNAs with an identical length (Fig. 2A) 
is far more than would be expected solely on 
the basis of differences in their spacers (14), 
which suggests that genomic location may play 
a substantial role. Although we detected no sig- 
nificant difference between the activity of hgRNAs 
that are in intergenic regions relative to those 
within known genes (Wilcoxon P > 0.1), among 
hgRNAs that have landed within known coding 
and noncoding genes, those that transcribe in the 
same direction as the gene had a lower activity 
than those that transcribe in the opposite direction
(Wilcoxon P < 0.05; Fig. 2D, fig. S3, and table S3).
These observations suggest that hgRNA activity
is affected by both genomic location and
interplay with endogenous elements. 
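The four activity categories described above can be expressed as a small decision rule. This is a minimal sketch, not the authors' procedure: it uses only the mutant fraction at E3.5 and in the adult, takes the 80% and 2% thresholds from the text, and assumes that "a minority of cells" means fewer than 50%. The function name and inputs are hypothetical.

```python
def classify_hgrna(frac_mutated_e3_5: float, frac_mutated_adult: float) -> str:
    """Assign an hgRNA to an activity category from its mutant-spacer fractions.

    Thresholds of 0.80 and 0.02 come from the text; 0.50 for "minority"
    is an assumption made for this sketch.
    """
    if frac_mutated_adult < 0.02:
        return "inactive"   # mutated in fewer than 2% of cells even in adults
    if frac_mutated_e3_5 >= 0.80:
        return "fast"       # mutated in at least 80% of cells by E3.5
    if frac_mutated_adult < 0.50:
        return "slow"       # mutated in only a minority of cells in adults
    return "mid"            # accumulates mutations through development


# Made-up example values, for illustration only.
print(classify_hgrna(0.90, 1.00))  # fast
print(classify_hgrna(0.10, 0.95))  # mid
print(classify_hgrna(0.01, 0.30))  # slow
print(classify_hgrna(0.00, 0.01))  # inactive
```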


Diversity and composition 
of hgRNA mutants 


We next analyzed the diversity produced by 
MARC1 hgRNAs by considering all observed
mutant spacer alleles in MARC1 × Cas9 offspring
(table S4). Only a handful of mutant spacer al- 
leles were detected for each hgRNA in each 
sample (Fig. 3A and fig. S4A). However, when 
combining mutant spacers from all offspring, 
on average, more than 200 distinct mutant spac- 
ers for each fast hgRNA and more than 300 for 
each mid hgRNA were observed (Fig. 3B and
fig. S4B). Furthermore, about 80% of all mutant
spacer alleles were unique observations in a
single offspring (Fig. 3C and fig. S5), which
suggests that the mutant alleles observed with
our sampling level constitute only a minority of
all mutant spacers possible. These results indicate
that each hgRNA can produce hundreds
of mutant alleles.

Fig. 3. Diversity of mutant hgRNA alleles in offspring of MARC1 ×
Cas9 cross. (A) Beanplots of the number of mutant spacer alleles
observed in each mouse for each hgRNA category. Short horizontal lines
mark the average for each hgRNA in the category; long horizontal lines
mark the average of all the hgRNAs in the category. See fig. S4A for a
separate plot for each hgRNA. (B) Beanplots of the total number of mutant
spacer alleles observed for each hgRNA in all mice. See fig. S4B for a
separate plot for each hgRNA. (C) Histogram (red bars) and cumulative
fraction (blue connected dots) of the number of mice in which each
mutant allele was observed, combined for all hgRNAs. See fig. S5 for a
separate plot for each hgRNA. (D) Relative ratio of recurring mutant
spacer alleles (fig. S5) (24) to the unique alleles. (E) Mutation types in
unique (top) and recurring (bottom) spacer alleles. See tables S4 and S5
for the sequences and alignment of all mutants and recurring
mutants, respectively, and fig. S6 for a separate plot for each hgRNA.
(F) Distribution of deletion length for unique and recurring mutant spacer
alleles. Deletions larger than 30 bp have been aggregated. (G) Schematic
representation of how five distinct deletion events can lead to the same
mutant spacer allele. (H) Distribution of deletion redundancy (that is, the
number of independent simple deletion events in the parental spacer allele
that would lead to the same observed deletion mutant) for unique and
recurring spacer alleles. Simple deletion is defined as deletion of a contiguous
stretch of bases without creating insertions or mismatches. Redundancy of
0 represents non-simple mutant alleles, which involve insertions, mismatches,
or noncontiguous deletions. (I) Distribution of insertion length for
unique and recurring mutant spacer alleles. Insertions of 20 bp or longer
have been aggregated. (J) Four observed examples of recurring single-base
insertions, involving duplication of the -4 position, for four different hgRNAs.
(K) Schematic representation of how a single-base staggered overhang
generated by Cas9 can lead to duplication of the -4 position.

Notably, although most mutant spacer alleles
appeared in only a single sample, about
6% recurred in multiple MARC1 × Cas9 offspring
(Fig. 3, C and D, and fig. S5). To understand this
phenomenon, we compared the nature of unique
and recurring mutant alleles (tables S4 and S5).
We observed that insertions or deletions (indels)
underlie the vast majority of alleles in both
unique and recurring mutations (Fig. 3E and
fig. S6, A and B). The exact nature of these indel
mutations, however, differs. First, short deletions
of 23 base pairs (bp) or fewer are enriched in the
recurring alleles (Fig. 3F). Interestingly, these
mutant alleles can be identical results of multiple 
different simple deletions in the parental spacer 
sequence (Fig. 3, G and H), which implies that 
this group of recurring mutations can result from 
distinct mutagenesis events that lead to the same 
sequence. Second, single-base insertions are drastically
enriched among recurring insertion mutants
(Fig. 3I). A closer examination of these single-base
insertions revealed that many follow the same
pattern: duplication of the base at the -4 position
relative to the protospacer adjacent motif
(PAM) (Fig. 3J). In fact, this type of insertion
was recurring in 34 of the 41 active hgRNAs.
This observation can be best explained by Cas9
cutting at the -4 position of the noncomplementary
strand and at the -3 position of the complementary
strand, thus creating a staggered end
with a 5′ overhang, which is then filled in on
both ends and ligated (Fig. 3K). Therefore, our
results suggest that Cas9 can produce staggered
cuts, and that the nature of these cuts, together
with the sequence of the target site, affects the
eventual outcome of nonhomologous end joining
(NHEJ) repair.


Developmental hgRNA barcodes


The results thus far indicate that MARC1 hgRNAs
accumulate mutations upon activation with Cas9
nuclease after conception. We next queried whether
these mutations indeed reflect developmental
events. For simplicity, we focused on fast
and mid hgRNAs in eight post-E12 MARC1 ×
Cas9 offspring for which four different tissues
had been sampled (Table 2). The sampled tissues
were the placenta, the yolk sac, the head,
and the tail. The barcode was defined for each
hgRNA in each sample as the frequency vector
of the relative abundances of all observed mutant
alleles (Fig. 4A). For the 32 samples under
consideration (eight conceptuses with four samples
each), these barcodes showed diverse and
complex patterns, with each sample having a
unique barcode but with varying degrees of
similarity to other samples (Fig. 4B and fig. S7).
To compare the hgRNA barcodes between samples,
we used a scaled Manhattan distance (L1)
of their frequency vectors, such that a distance
of 100 would indicate a completely nonoverlapping
set of mutant alleles and a distance of 0
would indicate a complete overlap of mutant
alleles with identical relative frequencies (24).

Fig. 4. In vivo barcoding in mouse embryos. (A) Barcode depiction for
each hgRNA in each sample. Each column corresponds to an observed
mutant spacer; each row corresponds to a sample. The color of each block
represents the observed frequency of the corresponding mutant spacer
in the corresponding sample. (B) In vivo-generated barcodes of three fast
and three mid hgRNAs in eight embryos from a MARC1 × Cas9 cross.
Four tissues were sampled from each embryo: the placenta (P), the yolk
sac (Y), the head (H), and the tail (T). Embryos 1 and 2 were obtained
at E16.5, whereas embryos 3 to 8 were obtained at E12.5 (Table 2). For
each hgRNA, the results for a maximum of four embryos are shown. Full
barcodes for all hgRNAs are in fig. S7. The color code is as shown in (A).
Only mutant alleles with a maximum abundance of more than 1% are
shown. (C) Histogram of the scaled Manhattan distances (L1) between the
barcodes of all possible sample pairs for each hgRNA, broken down by
sample pairs belonging to the same embryo (blue) and pairs belonging to
different embryos (orange). (D) The complete barcode, composed of
the concatenation of all hgRNA barcodes, for embryos 1 and 2. (E) Heat
map of the average Manhattan distance between the full barcodes of
placenta, yolk sac, head, and tail samples in all eight embryos. For a
separate map for each embryo, see fig. S8.


Pairwise comparison of all hgRNA barcodes 
among all samples (Fig. 4C) showed that more 
than 99% of barcode pairs have a scaled 
Manhattan distance of more than 5, indicating 
unique barcoding of each sample by each hgRNA. 
Furthermore, barcodes from different tissues 
of the same embryo were more similar to each 
other (median distance = 41) and more distinct 
from different embryos (median distance = 78) 
(Fig. 4C), which suggests that barcodes may re- 
cord information about the history of samples 
relative to one another. 
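The barcode comparison described above can be sketched as follows. This is an illustration under assumptions, not the authors' implementation: a barcode is taken as the vector of relative allele abundances summing to 1, so the raw L1 distance ranges from 0 to 2 and a factor of 50 maps it onto the 0 to 100 scale. The exact scaling used in the study is given in its supplementary methods (24). Function names and allele labels are hypothetical.

```python
from typing import Dict


def barcode(allele_counts: Dict[str, int]) -> Dict[str, float]:
    """Convert mutant-allele read counts for one hgRNA into relative abundances."""
    total = sum(allele_counts.values())
    if total == 0:
        return {}
    return {allele: count / total for allele, count in allele_counts.items()}


def scaled_manhattan(b1: Dict[str, float], b2: Dict[str, float]) -> float:
    """Scaled L1 distance: 0 for identical barcodes, 100 for disjoint allele sets."""
    alleles = set(b1) | set(b2)
    l1 = sum(abs(b1.get(a, 0.0) - b2.get(a, 0.0)) for a in alleles)
    return 50.0 * l1  # assumed scaling: normalized vectors differ by at most 2


# Made-up read counts, for illustration only.
head = barcode({"del3": 60, "ins1": 40})
tail = barcode({"del3": 50, "ins1": 50})
placenta = barcode({"del7": 100})

print(scaled_manhattan(head, head))
print(scaled_manhattan(head, tail))
print(scaled_manhattan(head, placenta))
```

With these made-up counts the head and tail barcodes are close, whereas the head and placenta barcodes share no alleles and sit at the maximum distance.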

To further evaluate this recording of sample 
histories, we created a “full” barcode for each 
sample by combining the barcodes generated 
by each of its hgRNAs (Fig. 4D) and compared 
the distance between these barcodes in the four 
tissues obtained from each embryo (Fig. 4E and 
fig. S8). The results show higher similarity between
the head and tail samples, which together
are the most different from placenta. The samples
obtained here represent mixed and overlapping
lineages. However, considering that the head and
tail are derived from the inner cell mass (ICM)
whereas the placenta is mostly derived from the
trophectoderm (29-31), these results suggest that
hgRNA barcodes of different tissues embody their
lineage histories.

Fig. 5. Lineage derivation based on hgRNA-generated developmental
barcodes. (A) Summary of the earliest lineages in mouse. (B) Schematic
representation of a blastocyst and an E12.5 mouse conceptus,
color-coded according to the origin of tissues in the blastocyst. Black
dots show the positions and tissues of the samples obtained from
E12.5 conceptuses. (C) Summary of how hgRNA barcodes were
compiled for each sample. Each bar represents a mutant spacer of an
hgRNA, and its color represents its abundance relative to other mutant
spacers of the same hgRNA in the same sample. (D) Full hgRNA
barcodes for all samples from the four mouse embryos analyzed. The
barcode is annotated in (C). Only mutant alleles with a maximum
abundance of more than 2% are shown. Deep pink bars below each map
mark highly recurring alleles that have been observed in more than
60% of all mice analyzed in Table 2. See table S6 for a numerical
version of each barcode map. (E) Lineage tree for each embryo
calculated from the barcodes in (D).


First lineage tree from barcode recordings


We next assessed whether accurate lineage trees 
can be constructed de novo from developmen- 
tally barcoded mice. To assess this potential, we 
focused on the tree of the first lineages in de- 
velopment. The first lineage segregation events 
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in mammals are the differentiation of blastomeres 
into the trophectoderm and ICM before E3.5, 
followed by differentiation of the ICM into the 
primitive endoderm and epiblast by E4.5 (Fig. 
5A) (29). To reconstruct this lineage tree, we 
used developmentally barcoded E12.5 conceptuses 
and sampled two distinct tissues from each of 
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Fig. 6. Lineage tree derivation robustness and contribution of each 
hgRNA. (A) The correct unrooted tree topology for the earliest lineages in 
mouse. Arrows indicate all possible roots. The empty arrow indicates the 
perfect root. (B) The perfect rooted topology and an example from each of the 
other topology classifications. The colored boxes below each topology 
constitute the color key for the remaining panels of the figure. (©) For each of 
the four embryos analyzed, distribution of tree calculation outcomes from 

all possible subsets of hgRNAs (2” — 1 non-null subsets for an embryo with 
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three lineages: the decidual zone (DZ) and the 
junctional zone (JZ) of the placenta, which are 
descendants of the trophectoderm (30, 37); the 
parietal endoderm (PE) and visceral endoderm 
(VE) of the yolk sac, which are descendants of 
the primitive endoderm; and the heart and a 
limb bud of the embryo proper, which are de- 
scendants of the epiblast (29) (Fig. 5B). We then 
assembled the full barcode for each sample 
(Fig. 5, C and D, and table S6) and, using their 
Manhattan distances, clustered them to form a 
tree for each embryo (Fig. 5E) (24). Remark- 
ably, despite the differences in the number and 
composition of hgRNAs inherited, the resulting 
tree perfectly matched the expected lineage in 
all four embryos, showing that the DZ and JZ 
form one clade of the tree while the other clade 
comprises two subclades, one with PE and VE 
and the other with the heart and limb bud 
(Fig. 5E). These results demonstrate that accu- 
rate lineage trees can be constructed from de- 
velopmentally barcoded mice. 
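
The distance-and-cluster step described above can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: the allele counts are invented toy values, the function names are hypothetical, and plain average linkage is swapped in for the Ward's criterion named in the Methods summary, purely to keep the example dependency-free.

```python
from itertools import combinations

def manhattan(a, b):
    """Sum of absolute differences between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))

def cluster(barcodes):
    """Naive agglomerative clustering (average linkage, NOT Ward's criterion).

    Returns the tree as nested tuples of sample names.
    """
    # Each cluster is (subtree, list of member sample names).
    clusters = [(name, [name]) for name in barcodes]
    while len(clusters) > 1:
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            pairs = [(p, q) for p in clusters[i][1] for q in clusters[j][1]]
            d = sum(manhattan(barcodes[p], barcodes[q]) for p, q in pairs) / len(pairs)
            if best is None or d < best[0]:
                best = (d, i, j)
        _, i, j = best
        merged = ((clusters[i][0], clusters[j][0]), clusters[i][1] + clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters[0][0]

# Hypothetical barcodes: percent abundance of four mutant alleles per sample.
barcodes = {
    "DZ":    [90, 10, 0, 0],
    "JZ":    [80, 20, 0, 0],
    "PE":    [10, 0, 80, 10],
    "VE":    [10, 0, 70, 20],
    "heart": [10, 0, 20, 70],
    "limb":  [10, 0, 10, 80],
}

tree = cluster(barcodes)
print(tree)
```

With these toy values the placental samples form one clade and the yolk sac and embryo-proper samples form the two subclades of the other, mirroring the topology reported in the text.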

We next evaluated the robustness of lineage 
tree derivation from hgRNA barcodes by calcu- 
lating the tree topology with only parts of the 
full barcodes. For a bifurcating tree with six 
tips (limb, heart, VE, PE, JZ, and DZ; Fig. 6A), 
945 distinct rooted topologies are possible (32). 
Only a single one of these 945 tree topologies 
perfectly matches the expected lineage tree; we 
refer to this topology as “perfect” (Figs. 5E and 
6B). Another eight topologies would be correct 
if unrooted—that is, if all four clades are cor- 
rectly assigned but the root is misplaced because 
a branch other than the one connecting the (DZ, 
JZ) clade to the ((PE, VE), (heart, limb)) clade is 
the longest (Fig. 6A). We refer to these topologies 
as “correct” (Fig. 6B). If three, two, or fewer than 
two of the four clades have been assigned cor- 
rectly, we consider the topologies as “incomplete,” 
“partial,” and “wrong,” respectively (Fig. 6B). 
With these distinctions, we evaluated the trees 
generated with all possible non-null subsets of 
the hgRNAs in each embryo. The results show 
that, depending on the embryo, 60% to 85% of 
all possible hgRNA subsets result in a perfect 
or correct topology (Fig. 6C), which compares 
favorably to the ~1% chance of randomly finding 
such topologies. With only three hgRNAs, more 
than 50% of all derived trees have a correct 
or perfect topology for each embryo (Fig. 6D). 
Furthermore, calculated topologies improve with 
increasing the number of hgRNAs (Fig. 6D). Com- 
bined, these results show that lineage tree deri- 
vation from in vivo-generated hgRNA barcodes 
is robust and that the use of a higher number of 
hgRNAs results in more reliable outcomes. 
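
The counts quoted in this paragraph can be checked with a short calculation. The sketch below uses the standard result that n labeled tips admit (2n - 3)!! rooted bifurcating topologies; the function names are illustrative and not taken from the paper.

```python
def rooted_bifurcating_topologies(n_tips):
    """Number of rooted, bifurcating, labeled tree topologies: (2n - 3)!!."""
    count = 1
    for k in range(3, 2 * n_tips - 2, 2):
        count *= k
    return count

def non_null_subsets(n_hgrnas):
    """Number of non-empty subsets of n hgRNAs."""
    return 2 ** n_hgrnas - 1

total = rooted_bifurcating_topologies(6)   # six tips: limb, heart, VE, PE, JZ, DZ
perfect_or_correct = 1 + 8                 # one "perfect" plus eight "correct"
chance = perfect_or_correct / total

print(total)             # 945
print(round(chance, 4))  # 0.0095, i.e. roughly 1%
```

The ratio 9/945 is the "~1% chance" against which the 60% to 85% success rate is compared.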

We then examined the contribution of each 
hgRNA to deriving the correct tree topology for 
each embryo. We defined the “impact score” of 
an hgRNA in each embryo’s early lineage tree 
as the difference between the fraction of all 
correct and perfect topologies in which the 
hgRNA was considered and the fraction of 
all wrong and partial topologies in which the 
hgRNA was considered (Fig. 6E and figs. S9 
and S10) (24). As such, an impact score of +1 
would indicate that whenever the hgRNA was 
included in tree derivation, a correct or a per- 
fect topology was obtained, and no such topol- 
ogies were obtained without that hgRNA. An 
impact score of -1 would indicate that when 
the hgRNA was included in tree derivation, only 
partial or wrong topologies were obtained. Values 
between +1 and -1 define the range between 
those entirely constructive or destructive out- 
comes, with an impact score of 0 indicating 
that the likelihood of obtaining a correct to- 
pology is the same with or without the hgRNA. 
Impact scores for hgRNAs in our four embryos 
show a positive average contribution by all three 
active hgRNA classes with slow hgRNAs, which 
are largely unmutated early in development 
(Fig. 2B), having an average impact close to 0, 
and mid and fast hgRNAs, which are active 
early in development (Fig. 2B), having increas- 
ingly positive impacts on the derivation of the 
correct tree (Fig. 6E). In fact, only three fast and 
mid hgRNAs suffice to obtain a correct or perfect 
topology in more than 90% of all derived trees 
(Fig. 6F). By contrast, exclusive use of slow hgRNAs 
does not recover the early lineage tree as reliably 
(Fig. 6F). Combined, these results suggest that 
active mutagenesis during a differentiation event 
allows it to be recorded. They also suggest that 
when the developmental stage in which a lin- 
eage differentiates is known, hgRNA activity 
profiles (Fig. 2A) can aid in choosing the ap- 
propriate hgRNAs such that correct trees can 
be reliably obtained with just a few hgRNAs. 
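
The impact score defined above can be written out directly. The following is a sketch under stated assumptions: the mapping from hgRNA subsets to topology classes is a made-up toy (any subset containing the hypothetical hgRNA "A" is labeled correct, every other subset wrong), and "incomplete" topologies are simply left out of both fractions, as the definition in the text implies.

```python
from itertools import combinations

GOOD = {"perfect", "correct"}
BAD = {"partial", "wrong"}

def impact_score(hgrna, outcomes):
    """Fraction of good topologies that used `hgrna` minus the fraction of bad ones.

    `outcomes` maps frozenset(hgRNA names) -> topology class for that subset.
    """
    good = [s for s, o in outcomes.items() if o in GOOD]
    bad = [s for s, o in outcomes.items() if o in BAD]
    frac_good = sum(hgrna in s for s in good) / len(good) if good else 0.0
    frac_bad = sum(hgrna in s for s in bad) / len(bad) if bad else 0.0
    return frac_good - frac_bad

# Toy outcome table over all non-null subsets of three hypothetical hgRNAs.
hgrnas = ["A", "B", "C"]
outcomes = {}
for size in range(1, len(hgrnas) + 1):
    for subset in combinations(hgrnas, size):
        outcomes[frozenset(subset)] = "correct" if "A" in subset else "wrong"

scores = {h: impact_score(h, outcomes) for h in hgrnas}
print(scores["A"])  # 1.0: every good tree used A and no bad tree did
```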

Interestingly, when only slow hgRNAs were 
considered in tree construction for early lineages, 
increasing the number of hgRNAs still resulted 
in improved outcomes (Fig. 6F). This observa- 
tion suggests that even when hgRNAs have low 
activity levels at the time an event is being re- 
corded, partial recordings from multiple hgRNAs 
can be combined to obtain a more complete rec- 
ording. As another example, four different hgRNAs 
from embryo 3 predict a partial tree when con- 
sidered on their own, yet the perfect tree is 
derived when all four are considered together 
(fig. S11), further supporting the integrability 
of hgRNA recordings. 

In two of our lineage-analyzed embryo sam- 
ples (Fig. 5), we noted several hgRNAs in which 
all ICM-derived tissue samples (PE, VE, heart, 
limb) were dominated by a single mutant allele, 
whereas the corresponding trophectoderm- 
derived tissue samples (DZ, JZ) displayed a more 
uniform distribution of multiple mutant alleles 
(Fig. 7). These profiles suggest that in these em- 
bryos, these hgRNAs mutated as the troph- 
ectoderm and ICM lineages differentiated, and 
that fewer blastomeres led to the ICM than to 
the trophectoderm. These observations are con- 
sistent with previously reported observations 
(33, 34) and suggest that hgRNA mutation pro- 
files could be used to measure both the rela- 
tionship between lineages and the relative number 
of cells that seed lineages. 


Axis development in the brain 


We next used developmentally barcoded mice 
to address lineages above the first lineages in 
the tree, with a focus on the establishment of the 
anterior-posterior (A-P) axis versus the lateral 
(L-R) axis in the brain. Patterning of the nervous 
system and its progenitors starts in gastrulation 
(E6.5) when the embryo has radial symmetry 
(35, 36). By E8.5, both A-P and L-R axes are 
established in the neural tube (Fig. 8A); how- 
ever, it remains unclear which axis is established 
first (37, 38). At a morphological level they ap- 
pear concurrently (39), and previous single-cell 
labeling and tracing experiments carried out 
ex vivo do not adequately address the issue (40). 
We analyzed two developmentally barcoded mice 
in the adult stage. In one, we dissected the left 
and right cortex and cerebellum, while in the 
other we additionally dissected the tectum. The 
cortex, tectum, and cerebellum respectively orig- 
inate from embryonic forebrain (prosenceph- 
alon), midbrain (mesencephalon), and hindbrain 
(rhombencephalon) vesicles of the neural tube 
(Fig. 8A). From each region, two samples of neu- 
ronal nuclei were sorted (24). We also obtained 

Fig. 7. Trophectoderm and ICM barcodes show differences in their number of mutant hgRNA 
alleles. Five barcodes from two embryos in Fig. 5D are shown that distinguish trophectoderm- 
derived and ICM-derived samples. Deep pink bars below each map mark highly recurring alleles 
that have been observed in more than 60% of all mice analyzed in Table 2. See table S6 for a 
numerical version of each barcode map. Only mutant alleles with a maximum abundance of more 
than 2% are shown. 


samples of the blood and muscle from each 
mouse, both mesoderm-derived, to serve as out- 
groups. We then assembled the full barcode 
for each sample and applied clustering as before 
(Fig. 8, B and C, and fig. S12). In addition to seg- 
regating the mesoderm- and ectoderm-derived 
cells, the results clearly show that neurons from 
the left side of each brain region are more closely 
related to neurons from the right side of the 
same region than they are to neurons from either 
of the other two regions. Considering that no 
extensive migration of neuronal cell bodies be- 
tween the regions sampled here has been reported 
(41), these results suggest that commitment to 
the A-P axis is established before commitment 
to the L-R axis in development of the central 
nervous system. 

Similar to the first lineage tree analysis above 
(Fig. 6), we evaluated the robustness of the brain 
axis tree derivation as well as the contribution of 
each hgRNA in mouse 2 (Fig. 8, D and E) (24). 
We assigned topologies with all three left and 
right sample pairs placed closest to one another 
as correct, and those with two, one, or zero pairs 
placed as incomplete, partial, and wrong, respec- 
tively. We then calculated the distribution of 
tree derivation outcomes with all possible sub- 
sets of active hgRNAs in mouse 2 (Fig. 8D). The 
results show that half of the combinations with 
only three hgRNAs derive a correct or partially 
correct topology, a ratio that only improves 
when including more hgRNAs. We also calcu- 
lated the impact score of each hgRNA (Fig. 8E) 
(24). Relative to impact scores for the first lin- 
eage tree (Fig. 6E), we found smaller contribu- 
tions by fast hgRNAs, which would be expected 
for lineages that segregate much later in devel- 
opment. Taken together, these results demon- 
strate that lineages across diverse developmental 
times are recorded in our developmentally bar- 
coded mice and can be extracted. 


Discussion 


In this study, we created an hgRNA mouse line 
for in vivo barcoding and used it to generate 
developmentally barcoded mice in which lineage 
information is recorded in cell genomes and can 
be extracted and reconstructed. Our strategy to 
create the MARC1 line was designed to address 
challenges associated with in vivo barcoding in 
a mouse model. First, genetic manipulation of 
individual mouse embryos is more challenging 
than that of lower vertebrates. Therefore, a line 
with genomically integrated, stable, and heri- 
table barcoding elements that can be activated 
by simply crossing with other lines is powerful, 
versatile, and shareable (24). Second, tracking 
development in mice demands that the system 
be capable of generating a great many barcodes 
with little overwriting or deletion (14). As such, 
we scattered hgRNAs throughout the genome 
instead of using a contiguous array, circumvent- 
ing large deletion events that can occur with 
multiple adjacent cut sites (42-44) and can 
remove prior recordings. In fact, we estimate 
that less than 1% of all mutations resulted in 
unidentifiable alleles by removing amplification 
primer binding sites or all unique sequences 
(24). The scarcity of these unwanted deletion 
events led to a great success rate in analyzing 
barcoded mice (4/4 in Fig. 5, 2/2 in Fig. 8). 
Furthermore, as hgRNA loci in this scattered 
array accumulate mutations independently, their 
mutant alleles combine exponentially to create a 
large diversity of barcodes. Consequently, the 41 
active MARC1 hgRNAs can in theory combine to 
create more than 10” different barcodes [$\prod_i n_i$, 
where $n_i$ is the total number of observed mutant 
alleles for hgRNA $i$ in this study, which is likely an 
underestimation (see above)]. Even only five fast 
and five mid hgRNAs can combine for roughly 


Table 1. hgRNAs in the MARC1 founder male. TSS-to-PAM length, observed inheritance probability, and 


tables S1 to S3 for more details. 
$10^{23}$ different barcodes ($200^5 \times 300^5$, where 
200 and 300 are the observed average number 
of mutant alleles for fast and mid hgRNAs, re- 
spectively). This remarkable diversity is ade- 
quate for uniquely barcoding every one of the 
~$10^{10}$ cells in a mouse. Furthermore, assuming 
a perfect binary developmental tree as a first- 
order approximation, this diversity is adequate 
for uniquely marking all the ~$2 \times 10^{10}$ internal 
and terminal nodes of the mouse developmen- 
tal tree. 
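
The order-of-magnitude argument for five fast and five mid hgRNAs can be reproduced with integer arithmetic. The sketch below uses only the figures stated in the text (200 and 300 average alleles, about 10^10 cells); the variable names are illustrative, and the node count relies on the stated assumption of a perfect binary tree, which has one fewer internal node than leaves.

```python
import math

fast_alleles, mid_alleles = 200, 300   # observed average mutant alleles per hgRNA
n_fast = n_mid = 5

# Independent loci combine multiplicatively.
barcode_count = fast_alleles ** n_fast * mid_alleles ** n_mid

cells = 10 ** 10                       # approximate cell count quoted in the text
tree_nodes = 2 * cells - 1             # leaves plus internal nodes of a full binary tree

print(barcode_count)                       # 777600000000000000000000
print(round(math.log10(barcode_count), 1)) # 23.9, i.e. on the order of 10^23
print(barcode_count > tree_nodes)          # True
```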

Close analysis of the nature of mutant alleles 
in hgRNA barcodes showed the interplay be- 
tween target site sequence and Cas9-induced 
double-strand breaks that determines the pos- 
sible NHEJ outcomes (Fig. 3). Specifically, short 
indels underlie recurring NHEJ outcomes. No- 
table among these is a recurring duplication of 
the base at the -4 position relative to the PAM 
in a majority of active hgRNAs. The most likely 
explanation for this observation is Cas9 creat- 
ing staggered cuts that produce a single-base 
5’ overhang (Fig. 3K), because a terminal trans- 
ferase activity would not duplicate the base ad- 
jacent to the cut site, and RuvC exonuclease 
activity on the noncomplementary strand would 
not result in an insertion at all. Whether Cas9 
creates blunt or staggered ends in vivo has been 
a subject of debate. Our observation in mice, com- 
bined with a recent report in yeast (45) and pre- 
vious in vitro and in vivo evidence (44, 46-50), 
clarifies that Cas9 can create staggered ends as 
well as blunt ends, although the ratio of the two 
is unknown as of yet. 
By crossing the MARC1 line with a line that 
constitutively expresses Cas9, we generated 
developmentally barcoded mice in which lineage 
information is recorded in the hgRNA barcodes. 
We were able to reconstruct parts of the lineage 
tree using these mice, with the first branches 
that emerge after the zygote, up to some of the 
germ layer, neuroectoderm, and the neural tube 
branches (Figs. 5 and 8). We find remarkable 
robustness and flexibility in these recordings 
(Figs. 6 and 8D). Specifically, there is overlap in 
recordings made by various hgRNAs, and there- 
fore the derived lineage tree is robust to remov- 
ing any part of the barcode. Furthermore, partial 
nonoverlapping recordings from different hgRNAs 
can be integrated to reconstruct a complete tree. 
Combined with evidence of sustained hgRNA 

Fig. 8. The anterior-posterior axis is established before the left-right axis in the development of the brain. (A) Dorsal view of the neural tube 
and superior view of the adult brain in mouse. The primary brain vesicles in the neural tube and their corresponding structures in the adult brain are 
shown. (B and C) Calculated trees based on hgRNA barcodes in two adult mice. See fig. S12 for the full barcodes. (D) Distribution of tree calculation 
outcomes for mouse 2 when only including m of the n hgRNAs ($^{n}C_{m}$ combinations). Only hgRNAs with at least a 7% mutation rate in one of the samples 
were considered. (E) Impact score of each hgRNA in the early lineage tree of mouse 2. 


Table 2. Breakdown of all mice used for hgRNA activity analysis according to developmental stage and number of samples obtained per mouse. 
[Table body not recovered. Columns: samples per mouse; stage of development (E3.5, E6.5, E7.5, E8.5, E10.5, E12.5, E14.5, E15.5, E16.5, Adult); total mice; total samples.] 

mutagenesis throughout gestation, these results 
suggest that developmentally barcoded mice 
embody information from various stages of 
development in embryonic and extraembryonic 
tissues. Extracting such information will be a 
matter of the type of question being investigated 
and an ability to isolate cells from relevant lin- 
eages. Another interesting possibility is to create 
different types of barcoded mice by crossing the 
MARC1 line with other S. pyogenes Cas9 lines. 
Among these are inducible Cas9s (51, 52), ones 
with different activity levels (53), tissue- or 
lineage-specific versions based on Cre drivers 
(27, 53), or base-editing Cas9s (54, 55). Such 
barcoded mice may enhance the capabilities of 
the system, overcome its shortcomings (24), or 
better focus its potential on specific problems. 

Our results provide a platform for in vivo 
barcoding and lineage tracing in the mouse. 
Although we have focused here on the record- 
ing aspect of in vivo developmental barcoding, 
more effective readout strategies—in particular, 
those with transcriptome-coupled single-cell 
readouts (19, 21, 22) or with in situ readouts 
(56)—will be necessary. Finally, in addition to 
lineage-tracing applications, this platform may 
also be applied to recording cellular signals over 
time (16, 17, 57-59) and uniquely barcoding each 
cell in a tissue or an organism for identification 
purposes, such as for connectome mapping in 
the brain (60-62). 


Methods summary 


All animal procedures were approved by the 
Harvard University Institutional Animal Care 
and Use Committee (IACUC). For embryonic 
samples, the MARC1 founder was crossed with 
a Cas9 knock-in female. Pregnant females were 
then dissected at the desired embryonic time 
points, designating noon of the day of vaginal 
plug detection as E0.5. For isolating neurons 
from adult barcoded female mice, brains were 
dissected into the regions of interest and ho- 
mogenized. Nuclei were isolated from the ho- 
mogenate by gradient ultracentrifugation, labeling 
with a NeuN antibody, and sorting the NeuN- 
positive fraction in flow cytometry. From all 
obtained samples, DNA was extracted and am- 
plified with specific primers for hgRNA loci. 
The resulting amplicons were sequenced with 
paired ends and analyzed to identify the hgRNA 
itself according to the identifier sequence, and 
the mutant allele according to the spacer se- 
quence. The sequencing results were processed 
and filtered to obtain a list of high-confidence 
unique spacer-identifier pairs observed in each 
sample and their respective abundances. For 
obtaining lineage trees, these lists were converted 
into frequency matrices and clustered hierarchi- 
cally using Ward’s criterion. All procedures for the 
experiments and data analyses are described in 
detail in the supplementary materials. 

The yellow fever virus (YFV) epidemic in Brazil is the largest in decades. The recent 
discovery of YFV in Brazilian Aedes species mosquitos highlights a need to monitor the 
risk of reestablishment of urban YFV transmission in the Americas. We use a suite of 
epidemiological, spatial, and genomic approaches to characterize YFV transmission. 
We show that the age and sex distribution of human cases is characteristic of sylvatic 
transmission. Analysis of YFV cases combined with genomes generated locally reveals an 
early phase of sylvatic YFV transmission and spatial expansion toward previously YFV-free 
areas, followed by a rise in viral spillover to humans in late 2016. Our results establish a 
framework for monitoring YFV transmission in real time that will contribute to a global 
strategy to eliminate future YFV epidemics. 


Yellow fever (YF) is responsible for 29,000 to 
60,000 deaths annually in South America and 
Africa (1) and is the most severe mosquito- 
borne infection in the tropics (2). Despite 
the existence of an effective YF vaccine since 
1937 (3), an estimated >400 million unvaccinated 
people live in areas at risk of infection (4). Yellow 
fever virus (YFV) is a member of the Flaviviridae 
family and is classified into four genotypes: East 
African, West African, South American I, and 
South American II (5-9). In the Americas, YFV 
transmission occurs mainly via the sylvatic cycle, 
in which nonhuman primates (NHPs) are in- 
fected by tree-dwelling mosquito vectors such 
as Haemagogus spp. and Sabethes spp. (10, 11). 
YFV transmission can also occur via an urban 
cycle, in which humans are infected by Aedes spp. 
mosquitoes that feed mostly on humans (12, 13). 

Brazil has recently experienced its largest- 
recorded YF outbreak in decades, with 2043 
confirmed cases and 676 deaths since December 
2016 (supplementary text and fig. S1) (14). The 
last YF cases in Brazil attributed to an urban 
cycle were in Sena Madureira, in the northern 
state of Acre, in 1942 (15). An intensive eradica- 
tion campaign eliminated Aedes aegypti and YF 
from Brazil in the 1950s (16), but the vector be- 
came reestablished in the 1970s and Aedes spp. 
mosquitoes are now abundant across most of 
Brazil (17). The consequences of a reignition of 
urban cycle transmission in Brazil would be se- 
rious, as an estimated 35 million people in areas 
at risk for YFV transmission in Brazil remain 
unvaccinated (4). New surveillance and analyt- 
ical approaches are therefore needed to monitor 
this risk in real time. 


Yellow fever virus outbreak in Brazil, 
2016-2017 


Between December 2016 and the end of June 
2017, there were 777 polymerase chain reaction 
(PCR)-confirmed human cases of YF across 10 
Brazilian states—mostly in Minas Gerais (MG) 
(60% of cases), followed by Espirito Santo (32%), 
Rio de Janeiro (3%), and Sao Paulo (3%) (18). The 
fatality ratio of severe YF cases was estimated at 
33.6%, comparable to previous outbreaks (19, 20). 
Despite the exceptional magnitude and rapid ex- 
pansion of the outbreak, little is known about its 
genomic epidemiology. Further, it is uncertain 
how the virus is spreading through space, as well 
as between humans and NHPs, and analytical 
insights into the contribution of the urban cycle 
to ongoing transmission are lacking. 

To characterize the 2017 YFV outbreak in 
Brazil, we first compared time series of con- 
firmed cases in humans (n = 683) and NHPs 
(n = 313) reported until October 2017 by public 
health institutes in MG, the epicenter of the 
outbreak (Fig. 1, A and B, and fig. S2). The time 
series are strongly associated (cross-correlation 
coefficient = 0.97; P < 0.001). Both peak in late 
January 2017, and we estimate that human cases 
lag behind those in NHPs by 4 days (table S1). 
NHP cases are geographically more dispersed 


¹Department of Zoology, University of Oxford, Oxford, UK. ²Computational Epidemiology Lab, Boston Children’s Hospital, Boston, MA, USA. ³Department of Pediatrics, Harvard Medical School, Boston, MA, USA. ⁴Laboratório de Flavivírus, Instituto Oswaldo Cruz, FIOCRUZ, Rio de Janeiro, Brazil. ⁵Laboratório de Virologia Molecular, Departamento de Genética, Instituto de Biologia, Universidade Federal do Rio de Janeiro, Rio de Janeiro, Brazil. ⁶Laboratório Central de Saúde Pública, Instituto Octávio Magalhães, FUNED, Belo Horizonte, Minas Gerais, Brazil. ⁷Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais, Brazil. ⁸Institute of Microbiology and Infection, University of Birmingham, Birmingham, UK. ⁹Department of Microbiology and Immunology, Rega Institute, KU Leuven, Leuven, Belgium. ¹⁰Department of Statistics, University of Oxford, Oxford, UK. ¹¹The Global Health Network, University of Oxford, Oxford, UK. ¹²Department of Statistics, Harvard University, Cambridge, MA, USA. ¹³Malaria Atlas Project, Big Data Institute, Nuffield Department of Medicine, University of Oxford, Oxford, UK. ¹⁴Faculty of Epidemiology and Population Health, London School of Hygiene and Tropical Medicine, London, UK. ¹⁵Mathematical Modelling of Infectious Diseases and Center of Bioinformatics, Institut Pasteur, Paris, France. ¹⁶CNRS UMR2000: Génomique Evolutive, Modélisation et Santé, Institut Pasteur, Paris, France. ¹⁷KwaZulu-Natal Research, Innovation and Sequencing Platform (KRISP), School of Laboratory Medicine and Medical Sciences, University of KwaZulu-Natal, Durban, South Africa. ¹⁸Secretaria de Estado de Saúde de Minas Gerais, Belo Horizonte, Minas Gerais, Brazil. ¹⁹Núcleo de Doenças de Transmissão Vetorial, Instituto Adolfo Lutz, São Paulo, Brazil. ²⁰Instituto de Medicina Tropical e Faculdade de Medicina da Universidade de São Paulo, São Paulo, Brazil. ²¹Retrovirology Laboratory, Federal University of São Paulo, São Paulo, Brazil. ²²School of Medicine of ABC (FMABC), Clinical Immunology Laboratory, Santo André, São Paulo, Brazil. ²³Coordenação de Vigilância Epidemiológica do Estado do Rio de Janeiro, Rio de Janeiro, Brazil. ²⁴Departamento de Vigilância das Doenças Transmissíveis da Secretaria de Vigilância em Saúde, Ministério da Saúde, Brasília-DF, Brazil. ²⁵Secretaria de Vigilância em Saúde, Coordenação Geral de Laboratórios de Saúde Pública, Ministério da Saúde, Brasília-DF, Brazil. ²⁶Organização Pan-Americana da Saúde/Organização Mundial da Saúde (OPAS/OMS), Brasília-DF, Brazil. ²⁷Public Health England, National Infections Service, Porton Down, Salisbury, UK. ²⁸NIHR HPRU in Emerging and Zoonotic Infections, Public Health England, London, UK. ²⁹Centre for the AIDS Programme of Research in South Africa (CAPRISA), Durban, South Africa. ³⁰Department of Biostatistics, UCLA Fielding School of Public Health, University of California, Los Angeles, CA, USA. ³¹Department of Biomathematics and Human Genetics, David Geffen School of Medicine at UCLA, University of California, Los Angeles, CA, USA. 

*These authors contributed equally to this work. 

†Corresponding author. Email: nuno.faria@zoo.ox.ac.uk (N.R.F.); luiz.alcantara@ioc.fiocruz.br (L.C.J.A.); oliver.pybus@zoo.ox.ac.uk (O.G.P.) 

Faria et al., Science 361, 894-899 (2018) 31 August 2018 



Fig. 1. Spatial and temporal epidemiology 
of YFV and CHIKV in Minas Gerais (MG). 
(A) Time series of human (H) YFV cases in MG 
(676 cases across 61 municipalities)—confirmed 
by serology, reverse transcription quantitative 
PCR (RT-qPCR), or virus isolation—during the 
first YFV epidemic wave (August 2016 to 
October 2017). (B) Same as in (A) but 
showing NHP YFV cases (313 cases across 
90 municipalities) confirmed by RT-qPCR. 
(C) Same as in (A) but showing human CHIKV 
cases (3668 cases across 129 municipalities). 
(D) Geographic distribution of human YFV 
cases in MG. (E) Geographic distribution 
of NHP YFV cases in MG. Figure S3 shows the 
corresponding geographic distribution of 
CHIKV cases. (F) Association between the 
number of human and NHP cases in each 
municipality of MG (Pearson’s r = 0.62; 

in MG than human cases, which are more con- 
centrated in the Teófilo Otoni and Manhuaçu 
municipalities (Fig. 1, D and E). Despite this, the 
numbers of human and NHP cases per munic- 
ipality are positively correlated (Fig. 1F). 
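
One generic way to estimate how far one case series lags another is to correlate the two at a range of offsets and keep the offset with the highest coefficient. The sketch below illustrates that idea only; the article does not describe its estimation procedure here (see its table S1), the daily counts are invented, and the function names are hypothetical.

```python
from statistics import mean

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def best_lag(leader, follower, max_lag):
    """Correlate leader[t] with follower[t + lag] for lag = 0..max_lag."""
    scores = {}
    for lag in range(max_lag + 1):
        n = len(leader) - lag
        scores[lag] = pearson(leader[:n], follower[lag:])
    return max(scores, key=scores.get), scores

# Hypothetical daily counts: the human series is the NHP series delayed by 4 days.
nhp_cases = [0, 1, 3, 7, 12, 9, 5, 2, 1, 0, 0, 0, 0, 0]
human_cases = [0, 0, 0, 0] + nhp_cases[:-4]

lag, scores = best_lag(nhp_cases, human_cases, max_lag=7)
print(lag)  # 4
```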

To establish whether human cases are ac- 
quired in proximity to potential sources of 
sylvatic infection, we estimated the distance 
between the municipality of residence of each 
human case and the nearest habitat of po- 
tential transmission, determined by using the 
enhanced vegetation index (EVI) (21) (supple- 
mentary materials). The average minimum dis- 
tance between areas with EVI > 0.4 and the 
residence of confirmed human cases is only 
5.3 km. In contrast, a randomly chosen resident 
of MG lives, on average, >51 km away from areas 
with EVI > 0.4. Similarly, human YFV cases 
reside, on average, 1.7 km from the nearest NHP 
case, whereas the mean minimum distance of 
a randomly chosen MG resident to the nearest 
NHP case is 39.1 km. This is consistent with YF 
infection risk being greatest for people who re- 
side or work in forested areas where sylvatic 
transmission occurs. We find that most human 
cases (98.5%) were reported in municipalities 
with estimated YFV vaccination coverage above 
the 80% threshold recommended by the World 
Health Organization (WHO). On average, human 
cases would need to travel 65 km from their place 
of residence to reach an area where vaccina- 
tion coverage is <80% (4). 
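
The "average minimum distance" statistic used in this paragraph can be illustrated as follows. This is a sketch under assumptions: it treats residences and candidate habitat locations as simple latitude and longitude points and uses great-circle distance, whereas the article's actual procedure is given in its supplementary materials. All coordinates and function names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def mean_min_distance(residences, sources):
    """Average, over residences, of the distance to the nearest source point."""
    nearest = [min(haversine_km(r, s) for s in sources) for r in residences]
    return sum(nearest) / len(nearest)

# Invented coordinates: case residences and high-EVI habitat points.
residences = [(-18.0, -42.0), (-19.0, -43.0)]
habitat = [(-18.0, -42.0), (-19.5, -43.0)]

print(round(mean_min_distance(residences, habitat), 1))  # 27.8
```

Running the same function over randomly sampled resident locations would give the comparison value for the general population.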


Risk of YFV urban transmission 


YFV was detected in Aedes albopictus mosqui- 
toes caught in MG in January 2017 (22). Further, 
experiments suggest that Aedes spp. mosqui- 

P < 0.0001; nonparametric Spearman’s rank 
ρ = 0.32; P < 0.05). 

Fig. 2. Age and sex 
distribution of YFV 
cases in MG, 2016-2017. 
Red bars show the 
proportion of observed 
YFV cases in MG that 
occur in each age class, 
in (A) males and (B) 
females. These empirical 
distributions are different 
from those predicted 
under two models (M1, 
pale blue bars; M2, 
orange bars) of urban 
cycle transmission 

(see text for details). 
toes from southeast Brazil can transmit Brazil- 
ian YFV, although perhaps less effectively than 
vectors from elsewhere in the country (23, 24). 
It is therefore important to investigate whether 
YFV cases in MG occur where and when Aedes 
spp. vectors are active. To do so, we analyzed 
confirmed chikungunya virus (CHIKV) cases 
from MG (Fig. 1C). 

CHIKV is transmitted by the urban mosqui- 
toes Ae. aegypti and Ae. albopictus (25). There 
were 3755 confirmed CHIKV cases in MG during 
January 2015 to October 2017. The CHIKV epi- 
demic in MG in 2017 began later and lasted longer 
than the YFV outbreak (Fig. 1C), consistent with 
the hypothesis that YFV and CHIKV in the re- 
gion are transmitted by different vector species. 
However, 29 municipalities with human YFV 
cases also reported CHIKV cases (Fig. 1D and 
fig. S3), indicating that YFV is indeed present in 
municipalities with Aedes mosquitoes. The mean 

Fig. 3. Molecular phylogenetics of the Brazilian 
YFV epidemic. (A) Maximum likelihood 
phylogeny of complete YFV genomes showing the 
outbreak clade (red triangle) within the South 
American I (SA1) genotype (Fig. 4 and fig. S6). 
SA2, WAfr, and EAfr indicate the South America II, 
West Africa, and East Africa genotypes, 
respectively. For clarity, five YFV strains introduced 
to Venezuela from Brazil (49) are not shown. The 
scale bar is in units of substitutions per site (s/s). Node 
labels indicate bootstrap support values. RO 2002, 


strain BeH655417 from Roraima; MG 2003, 
two strains from the previous YF outbreak in 
MG in 2003; 17DD, the vaccine strain used in 
Brazil; AO 2016, YFV outbreak in Angola in 
2015-2016 (13). (B) Root-to-tip regression of 


sequence sampling date against genetic divergence 


from the root of the outbreak clade (fig. S6). 
Sequences are colored according to sampling 
location (MG, Minas Gerais; ES, Espirito Santo; 
RJ, Rio de Janeiro; BA, Bahia). (C) Violin plots 
showing estimated posterior distributions 
(white circles denote means) of the time of the 
most recent common ancestor (TMRCA) of 
the outbreak clade. Estimates were obtained 


using two different datasets (gray, SA1 genotype; 
red, outbreak clade) and under different evolutionary 
models: a, uncorrelated lognormal relaxed 
clock (UCLN) model with a skygrid tree prior 
with covariates (specifically, the time series 
data shown in Fig. 1, A to C; also see fig. S7); 
b, UCLN model with a skygrid tree prior 
without covariates; c, fixed local clock model 
(see supplementary materials). 


YFV vaccination rate in districts with both YFV 
and CHIKV cases is 72.6% (range = 61 to 78%) 
(4). Thus, relatively high vaccination rates in the 
locations in MG where YF spillover to humans 
occurs, and potentially lower vector competence 
(23, 24), may ameliorate the risk of establish- 
ment of an urban YFV cycle in the state. However, 
adjacent urban regions (including Sao Paulo and 
Rio de Janeiro) have lower vaccination rates (4), 
receive tens of millions of visitors per year (26), 
and have recently experienced many human YFV 
cases (20). Thus, the possibility of sustained urban 
YFV transmission in southern Brazil and beyond 
necessitates continual virological and epidemio- 
logical monitoring. 

We sought to establish a framework to evaluate 
routes of YFV transmission during an outbreak 
from the characteristics of infected individuals. 
Specifically, we assessed whether an outbreak 
is driven by sylvatic or urban transmission by 
comparing the age and sex distributions of ob- 
served YFV cases with those expected under an 
urban cycle in MG. For example, an individual’s 
risk of acquiring YFV via the sylvatic cycle de- 
pends on their likelihood to travel to forested 
areas, an occurrence that is typically highest 
among male adults (27). In contrast, under an 
urban cycle, we expect more uniform exposure 
across age and sex classes, similar to that ob- 
served for urban cases in Paraguay (28) and 
Nigeria (29). 
The male-to-female sex ratio of reported YFV 
cases in MG is 5.7 (85% of cases are male), and 
incidence is highest among males aged 40 to 49 
(Fig. 2). We compared this distribution to that 
expected under two models of urban cycle trans- 
mission (supplementary materials). In model M1, 
age and sex classes vary in vaccination status but 
are equally exposed to YFV, a scenario that is typ- 
ical of arboviral transmission (30). Under model 
M1, predicted cases are characterized by a sex 
ratio ~1, and incidence peaks among individuals 
aged 20 to 25 (Fig. 2). In model M2, we assume 
that the pattern of YFV exposure among age and 
sex classes follows that observed for CHIKV. The 
sex ratio of reported CHIKV cases in MG is 0.49 
(33% of cases are male) (fig. S4). Under model 
M2, predicted incidence is highest in females 
aged >30. The discrepancy between the observed 
distribution and that predicted under the two 
urban cycle models indicates that the YF epidem- 
ic in MG is dominated by sylvatic transmission. 
This method shows that age- and sex-structured 
epidemiological data can be used to qualitatively 
evaluate the mode of YFV transmission during 
an outbreak. 
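
As a quick arithmetic check on the figures quoted above, a male-to-female case ratio R corresponds to a male fraction of R/(1 + R). The short sketch below applies that conversion to the two reported ratios; the function name is hypothetical and is not part of the study's analysis.

```python
def male_fraction(sex_ratio):
    """Convert a male-to-female case ratio into the fraction of cases that are male."""
    return sex_ratio / (1.0 + sex_ratio)

yfv = male_fraction(5.7)     # reported YFV cases in MG
chikv = male_fraction(0.49)  # reported CHIKV cases in MG
print(f"YFV: {yfv:.0%} male, CHIKV: {chikv:.0%} male")
```

Both values agree with the percentages given in the text (85% and 33%).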


Genomic surveillance of the Brazilian 
YFV outbreak 


During a YF outbreak, it is important to under- 
take virological surveillance to (i) track epidemic 
origins and transmission hotspots, (ii) characterize 
genetic diversity to aid molecular diagnostics, 
(iii) detect viral mutations associated with dis- 
ease severity, and (iv) exclude the possibility that 
human cases are caused by vaccine reversion. We 
generated 62 complete YF genomes from infected 
humans (n = 33) and NHPs (n = 29) from the 
most affected Brazilian states, including MG 
(n = 51), Espirito Santo (n = 8), Rio de Janeiro 
(n = 2), and Bahia (n = 1) (Fig. 3 and table S3). 
We also report two genomes from samples 
collected in 2003 during a previous YFV out- 
break in MG from 2002 to 2003 (37). Genomes 
were generated in Brazil using a combination of 
methods (tables S5 to S7); half were generated 
in MG using a MinION portable YFV sequencing 
protocol adapted from (32) (tables S4 and 
S5). This protocol was made publicly available 
in May 2017 after the completion of pilot sequenc- 
ing experiments using a cultured vaccine strain 
(supplementary materials). Median genome cov- 
erages were similar for samples obtained from 
NHPs [99%; median cycle threshold value (Ct) = 11] 
and from human cases (99%; median Ct = 16) 
(tables S5 to S7). 

To put the newly sequenced YFV genomes in a 
global context, we added our genomes to a pool 
of 61 publicly available genomes (33, 34). We 
developed and applied an automated online 
phylogenetic tool to identify and classify YFV 
gene sequences (also publicly available, see sup- 
plementary materials). Phylogenies estimated by 
Fig. 4. Spatial and evolutionary dynamics of YFV outbreak. (A) Frequency 
of detection of YFV in NHPs in the Americas (50). Circle sizes represent the 
proportion of published studies (n = 15) that have detected YFV in each 
primate family and region. SA, South America (except Brazil); CA, Central 
America; CB, Caribbean; BR1, Brazil (before 2017); BR2, Brazil (this study). 
(B) Maximum clade credibility phylogeny inferred under a two-state (human 
and NHP) structured coalescent model. External node symbols denote 
sample type. Gray bars and labels indicate sample location (RJ, Rio de 
Janeiro; ES, Espirito Santo; BA, Bahia; others were sampled in MG). Internal 
nodes whose posterior state probabilities are >0.8 are annotated by circles. 
Node labels indicate posterior state probabilities for selected nodes. 

Internal branches are blue for NHPs and red for humans. Figure S8 shows a 
fully annotated tree. (C) Average number of YFV phylogenetic state 
transitions (from NHPs to humans) per month. Solid line, median estimate; 
shaded area, 95% BCI. (D) Expansion of the YFV epidemic wavefront 
estimated using a continuous phylogeographic approach (35). At each time 
point the plot shows the maximum spatial distance between phylogeny 
branches and the inferred location of outbreak origin. Solid line, median 
estimate; shaded area, 95% BCI. The dashed lines in (B) to (D) indicate when 
YF was declared a public health emergency in MG (13 January 2017). 

(E) Reconstructed spatiotemporal diffusion of the YFV outbreak. 
Phylogeny branches are arranged in space according to the locations of 
phylogeny nodes (circles). Locations of external nodes are known, whereas 
those of internal nodes are inferred (44). DF, Distrito Federal; GO, Goiás; 
SP, Sao Paulo. Shaded regions represent 95% credible regions of internal 
nodes. Nodes and uncertainty regions are colored according to time. 
this tool, along with maximum likelihood and 
Bayesian methods, consistently place the Brazilian 
outbreak strains in a single clade within the South 
America I (SA1) genotype with maximum statistical 
support (bootstrap = 100%; posterior probability 
> 0.99) (Fig. 3A and fig. S5). 

The outgroup to the outbreak clade is strain 
BeH655417, a human case sampled in Alto Alegre, 
Roraima, north Brazil, in 2002. In contrast, iso- 
lates sampled during the previous outbreak in 
MG in 2003 are more distantly related to the 
outbreak clade within the SA1 genotype (Fig. 3A). 
Thus, the 2017 outbreak was more likely caused 
by a YFV strain introduced from an endemic area, 
possibly northern or center-west Brazil (35), than 
by the reemergence of a lineage that had per- 
sisted in MG. Although low sampling densities 
mean that this conclusion is provisional, simi- 
lar scenarios have been suggested for previous 
Brazilian epizootics (36). The 14-year gap be- 
tween the current outbreak and the date of the 
most closely related nonoutbreak strain agrees 
with the reported periodicity of YF outbreaks in 
northern Brazil (37), thought to be dictated by 
vector abundance and the accumulation of sus- 
ceptible NHP hosts (19, 38). 

At least seven humans from MG with PCR- 
confirmed YFV received a YF vaccine before the 
onset of symptoms. To test whether these occurrences 
were caused by natural infection, and not 
by vaccine reactivation, we sequenced the YFV 
genomes from three of these cases (Fig. 3A and 
table S3). Our phylogenetic analysis clearly shows 
that these represent natural infections caused by 
the ongoing outbreak and are conclusively not 
derived from the 17DD vaccine strain (which be- 
longs to the West African YFV genotype) (Fig. 3A 
and fig. S6). 


Unifying YFV epidemiology 

and molecular evolution 

Virus genomes are a valuable source of informa- 
tion about epidemic dynamics (39) but are rarely 
used to investigate specific YFV outbreaks in de- 
tail. Here we show how a suite of three analytical 
approaches, which combine genetic, epidemio- 
logical, and spatial data, can provide insights into 
YFV transmission. 

First, we used a Bayesian method (40) to ex- 
plore potential covariates of fluctuations in the 
effective population size of the YFV outbreak in 
2017. After finding that genetic divergence in the 
outbreak clade accumulates over the time scale 
of sampling (Fig. 3B and fig. S6), albeit weakly, 
we sought to determine which epidemiological 
time series best describe trends in inferred YFV 
effective population size. We found that effective 
population size fluctuations of the YFV outbreak 
are well explained by the dynamics of both hu- 
man and NHP YFV cases (inclusion probability: 
0.37 for human cases and 0.63 for NHP cases) 
(table S8). These two YFV time series explain 
the genetic diversity dynamics of the ongoing 
outbreak 10^3 times more effectively than CHIKV 
incidence (inclusion probability <0.001), which 
represents transmission by Aedes spp. vectors. One 
benefit of this approach is that epidemiological 
data contribute to estimation of the outbreak 
time scale. By incorporating YFV incidence data 
into evolutionary inference, we estimate the time 
of the most recent common ancestor (TMRCA) 
of the outbreak clade to be late July 2016 
[95% Bayesian credible interval (BCI): March 
to November 2016] (Fig. 3C and fig. S7), con- 
sistent with the date of the first PCR-confirmed 
case of YFV in a NHP in MG (July 2016). The 
uncertainty around the TMRCA estimate is re- 
duced by 30% when epidemiological and genomic 
data are combined, compared with genetic data 
alone (Fig. 3C). 

Second, to better understand YFV transmis- 
sion between humans and NHPs, we measured 
the movement of YFV lineages between the NHP 
reservoir and humans, using a phylogenetic 
structured coalescent model (47). Although pre- 
vious studies have confirmed that YFV is circu- 
lating in five neotropical NHP families (Aotidae, 
Atelidae, Callitrichidae, Pitheciidae, and Cebidae) 
(Fig. 4A), thus far NHP YFV genomes during 
the 2017 outbreak have been recovered only from 
Alouatta spp. (family Atelidae) (33). In this anal- 
ysis, we used the TMRCA estimate obtained 
above (Fig. 3C) to inform the phylogenetic time 
scale (Fig. 4B). All internal nodes in the outbreak 
phylogeny whose host state is well supported 
(posterior probability >0.8) are inferred to belong 
to the NHP population, consistent with an ab- 
sence of urban transmission and in agreement 
with the large number of NHP cases reported 
in southeast Brazil (20). Despite this, we cau- 
tion that hypotheses of human-to-human transmis- 
sion linkage should not be tested directly using 
phylogenetic data alone, owing to the large un- 
dersampling of NHP infections. Notably, the 
structured coalescent approach reveals sub- 
stantial changes in the frequency of NHP-to- 
human host transitions through time, rising 
from zero around November 2016 and peaking 
in February 2017 (Fig. 4C). This phylogenetic 
trend matches the time series of confirmed YFV 
cases in MG (Fig. 1, A and B), demonstrating that 
viral genomes, when analyzed using appropriate 
models, can be used to quantitatively track the 
dynamics of zoonosis during the course of an 
outbreak (42). 

Third, we used a phylogenetic relaxed random 
walk approach to measure the outbreak’s spa- 
tial spread (43) (supplementary materials and 
methods and table S9). When projected through 
space and time (Fig. 4, D and E, and movie S1), 
the phylogeny shows a southerly dissemination 
of virus lineages from their inferred origin in MG 
toward densely populated areas, including Rio 
de Janeiro and Sao Paulo (where YF vaccina- 
tion was not recommended until July 2017 and 
January 2018, respectively). We estimate that 
virus lineages move, on average, 4.25 km/day 
(95% BCI: 2.64 to 10.76 km/day) (44). This 
velocity is similar when human YFV terminal 
branches are removed (5.3 km/day) and there- 
fore most likely reflects YFV lineage movement 
within the sylvatic cycle and not the movement 
of asymptomatic infected humans. These rates 
are higher than expected given the distances 
typically travelled by NHPs in the region (45) and 
suggest the possibility that YFV lineage move- 
ment may have been aided by human activity— 
e.g., transport of infected mosquitoes in vehicles 
(46) or hunting or illegal trade of NHPs in the 
Atlantic forest (47, 48). The epidemic wavefront 
(maximum distance of phylogeny branches from 
the inferred epidemic origin) expanded steadily 
between August 2016 and February 2017 at an 
estimated rate of ~3.3 km/day. Therefore, by the 
time YF was declared a public health emergency 
in MG (13 January 2017; dashed lines in Fig. 4, 
B to D), the epidemic had already expanded 
~600 km (Fig. 4D) and caused >100 reported 
cases in both humans and NHPs (Fig. 1). Notably, 
the first detection in humans in December 2016 
was concomitant with the outbreak’s spatial ex- 
pansion phase (Fig. 4D) and the rise in estimated 
NHP-to-human zoonoses (Fig. 4C); both were likely 
driven by an increase in the abundance of sylvatic 
vectors. Thus, the outbreak lineage appeared to 
circulate among NHPs in a widening geographic 
area for several months before human cases were 
detected. 


Conclusion 


Epidemiological and genomic surveillance of 
human and animal populations at risk is crucial 
for early detection and rapid containment of YFV 
transmission. The YFV epidemic in Brazil con- 
tinues to unfold with an increase in cases since 
December 2017. Longitudinal studies of NHPs 
are needed to understand how YFV lineages dis- 
seminate across South America between out- 
breaks and how epizootics are determined by 
the dynamics of susceptible animals in the re- 
servoir. To achieve the WHO’s goal to eliminate 
YF epidemics by 2026, YF surveillance necessi- 
tates a global, coordinated strategy. Our results 
and analyses show that rapid genomic surveil- 
lance of YFV, when integrated with epidemio- 
logical and spatial data, could help anticipate 
the risk of human YFV exposure through space 
and time and monitor the likelihood of sylvatic 
versus urban transmission. We hope that the tool- 
kit introduced here will prove useful in guiding 
YF control in a resource-efficient manner. 
The spatial footprint of injection 
wells in a global compilation of 
induced earthquake sequences 


Thomas H. W. Goebel*† and Emily E. Brodsky† 


Fluid injection can cause extensive earthquake activity, sometimes at unexpectedly 
large distances. Appropriately mitigating associated seismic hazards requires a better 
understanding of the zone of influence of injection. We analyze spatial seismicity decay in 
a global dataset of 18 induced cases with clear association between isolated wells and 
earthquakes. We distinguish two populations. The first is characterized by near-well 
seismicity density plateaus and abrupt decay, dominated by square-root space-time 
migration and pressure diffusion. Injection at these sites occurs within the crystalline 
basement. The second population exhibits larger spatial footprints and magnitudes, 
as well as a power law-like, steady spatial decay over more than 10 kilometers, 
potentially caused by poroelastic effects. Far-reaching spatial effects during injection 
may increase event magnitudes and seismic hazard beyond expectations based on 


purely pressure-driven seismicity. 


Human-induced seismicity close to geothermal, 
hydraulic fracturing, and wastewater disposal 
wells is a source of substantial seismic hazard. 
Such injection activity has led to many 
moderate-magnitude earthquakes and an exceptional 
increase in earthquake rates in parts of North 
America and Central Europe after ~2006 (1-3). The 
hazard from 
injection-induced earthquakes is particularly 
difficult to manage because the earthquakes 
frequently occur at large distances (>10 km) from 
the targeted injection zones (4-6). 

A better understanding of the driving mech- 
anisms of injection-induced seismicity is vital 
for improving seismic hazard assessment and 
mitigation. Traditionally, such hazard assessment 
has concentrated on pore-pressure increase in 
a volume hydraulically connected to the injec- 
tion wells (7, 8). Pore-pressure increase, which 
we here call the direct pressure effect, is thought 
to reduce the normal load on locked faults, 
thereby allowing sliding to occur. 

As induced seismicity cases and studies have 
proliferated, the importance of additional mechanisms 
such as elastic and fully coupled poroelastic 
stresses has become increasingly clear 
(9-11). The pore-pressure increase in the injec- 
tion zone is expected to load the surrounding 
rock matrix and result in a fully coupled fluid- 
solid stress field. Elasticity is an effective means 
of transmitting forces to great distances, and 
therefore the fully coupled poroelastic stress 
field can extend well beyond the fluid pressure 
increase in the hydraulically connected region. 
For example, large-scale, field-wide injection can 
perturb faults and induce earthquakes more than 
30 km away (11). In controlled injection exper- 
iments that resolve coupled aseismic and seis- 
mic processes during fluid injection, induced 
earthquakes are absent within the pressurized 
zone but occur as a result of elastic stress changes 
in the surrounding rock volume (12). These observations 
require a reexamination of the controls 
on induced seismicity and suggest that 
the spatial reach of the earthquakes may be a 
useful discriminant of triggering mechanisms. 
The observations also suggest that induced earth- 
quakes may extend farther than previously thought 
[i.e., at least to 30 km from wells (4, 12)], and 
empirically measuring the spatial extent is nec- 
essary to develop appropriate mitigation and 
regulatory approaches. 

Here we examine the distance of induced 
earthquakes from injection wells to understand 
the mechanical controls on the spatial extent of 
induced seismicity. We start by quantifying the 
shape of spatial decay and find two groups with 
distinct decay patterns. We then analyze migra- 
tory behavior within each group, as well as the 
relationship between operational parameters 
and the resultant spatial pattern of earthquakes. 
We examine differences in observed maximum 
magnitudes, which appear to be linked to the 
type of spatial pattern. Lastly, we discuss possi- 
ble physical mechanisms that control the distinct 
behavior within the two groups, with a focus on 
potential differences in poroelastic properties 
between sedimentary and basement units. 

We compiled measurements of the spatial 
decay of induced earthquakes from causative 
wells worldwide (13-23). We focused on isolated 
injection sites with well-recorded induced sequences, 
which are clearly connected to a single 
geothermal, scientific, or waste-fluid injection 
well (fig. S1; section S2 describes two excep- 
tions with more than one well). This strategy 
eliminates some major regions, such as Okla- 
homa, where effects of field-wide injection are 
difficult to unravel. All of the data used are in 
the public domain (24). The initial 27 selected 
injection sites are predominantly located in 
intraplate regions with low seismic activity 
throughout the United States, central Europe, 
and Australia (Fig. 1A). We included one well 
from the northwestern edge of the Geysers geo- 
thermal field because of its isolated location 
and several-kilometer distance to other active 
injection wells during the analyzed time period 
from 2007 to 2009 (16). 

Before determining the spatial seismicity decay 
from wells, we assessed the quality of the seis- 
mic record, focusing on spatial and magnitude 
information. We first selected events above the 
magnitude of completeness (M_c), determined by 
minimizing the misfit between the observed distribution 
and the Gutenberg-Richter relation 
(fig. S2). The datasets for Soultz-sous-Forêts, 
France, and Fenton Hill, New Mexico, do not in- 
clude magnitude information. For the Soultz site, 
magnitude estimates are available from a surface 
array for a subset of events, showing consistent 
results between borehole and magnitude-corrected 
surface catalogs. For the Fenton Hill site, the entire 
seismic record was used. 
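
The completeness selection can be illustrated with a minimal sketch. It assumes one common way of measuring misfit: for each candidate cutoff, fit an exponential (Gutenberg-Richter) distribution by maximum likelihood and record the largest gap between the fitted and empirical cumulative distributions. The specific misfit measure, the function names, the `min_events` threshold, and the synthetic catalog are all assumptions made for illustration; the study's exact procedure is described in its supplementary materials.

```python
import math

def gr_misfit(mags, mc):
    """KS-type distance between magnitudes >= mc and an exponential (G-R) fit."""
    excess = sorted(m - mc for m in mags if m >= mc)
    n = len(excess)
    mean_excess = sum(excess) / n
    if mean_excess <= 0.0:
        return float("inf")
    beta = 1.0 / mean_excess  # maximum-likelihood rate; b-value = beta / ln(10)
    worst = 0.0
    for i, x in enumerate(excess):
        model_cdf = 1.0 - math.exp(-beta * x)
        # compare the model with the empirical CDF just before and just after x
        worst = max(worst, abs(model_cdf - i / n), abs(model_cdf - (i + 1) / n))
    return worst

def estimate_mc(mags, candidates, min_events=50):
    """Pick the candidate cutoff whose above-cutoff events best follow G-R."""
    usable = [c for c in candidates if sum(m >= c for m in mags) >= min_events]
    return min(usable, key=lambda c: gr_misfit(mags, c))

# Synthetic catalog (illustrative only): complete above M 1.0 with b = 1,
# plus a deliberately under-recorded set of smaller events.
n = 1000
beta_true = math.log(10.0)
complete = [1.0 - math.log(1.0 - (i + 0.5) / n) / beta_true for i in range(n)]
under_recorded = [round(0.1 * k, 1) for k in range(10) for _ in range(20)]
catalog = complete + under_recorded
candidates = [round(0.1 * k, 1) for k in range(31)]
mc_est = estimate_mc(catalog, candidates)
print("estimated completeness magnitude:", mc_est)
```

Cutoffs below the true completeness level include the under-recorded events, which distorts the fitted distribution and produces a much larger misfit than cutoffs at or above it.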

In a second quality-control step, we tested 
whether the observed distributions exhibit signif- 
icant deviations from random Gaussian location 
uncertainty and show significant spatial cluster- 
ing close to wells at rates above the background 
activity, using a two-sided Kolmogorov-Smirnov 
test (fig. S4). The quality-control steps elimi- 
nated nine induced cases, leaving 18 sequences 
for further analysis. 

For the high-quality sequences, we computed 
two-dimensional (2D) distances between wells 
and earthquakes at the average depth of the 
injection interval, taking into account the well 
trajectories. Relative horizontal location un- 
certainties of seismic events are on the order 
of tens of meters, whereas absolute location 
uncertainty ranges from 100 to 500 m (section 
S4), and average vertical location errors are 
more than 1 km. Because the vertical uncer- 
tainties are large, we initially focused on 2D 
distances and later compared the results with 
3D distances. Seismicity distance fall-off from 
wells was calculated from areal densities de- 
termined by nearest-neighbor binning of the k 
closest events with a moving window of k/2, 
resulting in density estimates, ρ = k/Δr, where Δr is 
the distance between the first and kth event. 
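
The binning step can be sketched as follows, assuming sorted well-to-event distances, bins of k consecutive events, and a window that advances by k/2 events. The function name and the synthetic distances are hypothetical. The sketch returns k divided by the bin width, as stated in the text; converting this to events per unit area would need an additional ring-area normalization, which is not attempted here.

```python
def nn_density(distances, k=20):
    """Return (r_mid, density) pairs from overlapping bins of k consecutive events."""
    r = sorted(distances)
    step = max(1, k // 2)  # bins overlap by half of their events
    profile = []
    for start in range(0, len(r) - k + 1, step):
        window = r[start:start + k]
        dr = window[-1] - window[0]  # distance between the first and kth event
        if dr > 0:
            profile.append((0.5 * (window[0] + window[-1]), k / dr))
    return profile

# Evenly spaced synthetic distances (metres) give a flat density profile.
distances = [100 * i for i in range(1, 41)]
profile = nn_density(distances, k=10)
for r_mid, density in profile:
    print(r_mid, density)
```

Because every bin holds the same number of events, sparse regions far from the well get wide bins and dense regions near the well get narrow ones, which keeps the counting uncertainty similar across the profile.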

The seismicity density distributions can gen- 
erally be described by two types of spatial decay: 
(i) sequences with an extended plateau close to 
the well, followed by an abrupt decay within 
less than 1 km, and (ii) sequences with steady, 
power law-like decay out to distances of more 
than 10 km (Fig. 1B). The shapes of the two types 
Fig. 1. The spatial decay of induced sequences can be classified 

as abrupt or steady. (A) Map of the injection sites (blue triangles) in the 
United States, Australia, and Europe (fig. S1). (B) Seismic density of 

all studied induced sequences, normalized by number of events above 
completeness. We show cases with abrupt decay in shades of blue and steady 
decay in shades of red. We shifted abrupt decays vertically (translucent 

blue dots show one example of the original vertical position; the blue arrow 


of spatial decay remain consistent across a range 
of spatial scales and durations of injection op- 
erations, spanning hours (e.g., Rustrel) to years 
(e.g., Paradox), both for the complete datasets 
and for the subsets of events from individual 
sites (fig. S8). The consistency in the shapes 
of spatial decay suggests an underlying time- 
invariant process. We distinguished the two 
types of distance fall-offs quantitatively by fit- 
ting with the functional form 


$$\rho = \rho_0 \, \frac{1}{\left[1 + (r/r_c)^{2\gamma}\right]^{1/2}} \qquad (1)$$


where ρ is seismic density, ρ_0 describes the short-distance 
density plateau, r_c is the corner distance, 
and γ is an exponent describing the abruptness 
of the decay at larger distances. We chose 
this functional form to capture both differences 
in near-well density plateaus and the abruptness 
of distance fall-offs. We determined the 
three free parameters by fixing ρ_0 using the average 
density of the first five sample points and 
then inverting for r_c and γ using a maximum 
likelihood approach that assumes Poissonian 
uncertainty in each distance bin (24). The distribution 
of the resulting decay exponents exhibits 
a natural break between γ = 3.1 and 4.3, 
separating sequences into steady decay with 
γ = 1.5 to 3.1 and abrupt decay with γ = 4.3 to 
5.9 (fig. S5). The difference between these two 
populations exceeds the maximum expected uncertainty 
of ±0.34 based on the 95% confidence 
intervals of the maximum likelihood fits. In 
addition to having larger γ values than steady-decay 
sites, sites with abrupt decay are characterized 
by corner distances that are substantially 
closer to the maximum extent of the sequence 
(fig. S6). 
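
A minimal sketch of the fitting step is given below. It assumes the functional form ρ = ρ_0 [1 + (r/r_c)^(2γ)]^(-1/2), which is one reading of Eq. 1 and should be checked against the published equation. It fixes ρ_0 from the first five sample points and grid-searches r_c and γ under a Poisson log-likelihood. The grid search, the unit bin exposure, the function names, and the synthetic data are illustrative assumptions rather than the authors' implementation.

```python
import math

def decay_model(r, rho0, rc, gamma):
    """Plateau of height rho0 out to rc, then power-law fall-off with exponent gamma."""
    return rho0 / math.sqrt(1.0 + (r / rc) ** (2.0 * gamma))

def poisson_loglike(r, counts, rho0, rc, gamma):
    """Poisson log-likelihood up to a constant, assuming unit exposure per bin."""
    total = 0.0
    for ri, ni in zip(r, counts):
        lam = decay_model(ri, rho0, rc, gamma)
        total += ni * math.log(lam) - lam
    return total

def fit_decay(r, counts, rc_grid, gamma_grid):
    rho0 = sum(counts[:5]) / 5.0  # plateau fixed from the first five sample points
    rc, gamma = max(((a, g) for a in rc_grid for g in gamma_grid),
                    key=lambda p: poisson_loglike(r, counts, rho0, p[0], p[1]))
    return rho0, rc, gamma

# Noise-free synthetic profile with an abrupt decay (illustrative values only).
r = [10.0 * 10.0 ** (3.0 * i / 29.0) for i in range(30)]  # 10 m to 10 km
counts = [decay_model(ri, 1000.0, 500.0, 4.5) for ri in r]
rc_grid = [100.0 * j for j in range(1, 21)]
gamma_grid = [0.5 * j for j in range(2, 13)]
rho0_hat, rc_hat, gamma_hat = fit_decay(r, counts, rc_grid, gamma_grid)
print(rho0_hat, rc_hat, gamma_hat)
```

With real catalogs the counts are noisy integers, and a continuous optimizer with confidence intervals from the likelihood surface would be a natural replacement for the coarse grid used here.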

Several observations indicate that the result 
is not an artifact of specific instrumental, sta- 
tistical, or measurement practices: (i) The maxi- 
mum distance of the earthquakes is smaller than 
array aperture in all cases, so that the spatial de- 
cay is not a product of limited array extent, which 
may truncate the data. We further tested the 
influence of array geometry by computing M_c 
as a function of distance from wells and found no 
systematic bias. (ii) Where a single site allows for 
comparison between instrumentation (i.e., surface 
versus borehole arrays), the results are identical. 
shows the shift) to allow for easier visualization. (C) Theoretical expectation 
of spatial density fall-off using Eq. 2; the blue curve shows the abrupt 
decay of pressure-dominated sites, and the red curve shows a power law 
decay for sites with strong poroelastic coupling. (D) Merged densities above 
r_c for sequences with steady decay (black markers) and power law fit 
(red dashed line) with ~r^-1.8, which is more gradual than spatial aftershock 
decay in California (gray markers and black dashed line). 


Specifically, the spatial decay at the Soultz site 
shows abrupt fall-offs in both the borehole and 
the wide-aperture surface array catalogs when 
corrected for M_c. (iii) In addition to fitting Eq. 1, 
the two types of decay can be identified by using 
pure power laws. A power law is well fit over a 
large distance range for sites with steady decay, 
whereas abrupt-decay sites show power law-like 
behavior only over a limited range of distances 
that are near the maximum distance of the data- 
sets (fig. S6). (iv) Lastly, stacking seismicity den- 
sity estimates on the basis of 2D or 3D distance 
leads to a visible separation of steady and abrupt 
decay (Fig. 1B and fig. S11). 

To connect the spatial decay to perturbing 
stresses, we considered induced pore-pressure 
changes and poroelastic coupling. We expressed 
the observed seismicity fall-off within a Coulomb 
failure framework as the product of stress per- 
turbation and the number of faults sufficiently 
close to failure (25) 


$$\rho(r) \propto \Delta\sigma \, \frac{N_{fl}}{2\pi r \Delta r} \qquad (2)$$


where ρ is seismicity density, Δσ is the induced 
Fig. 2. Sequences with abrupt spatial decay are dominated by 
square-root migration, an indication of pressure diffusion. Examples 
of (A) linear and (B) square-root migration. Gray markers show event time and 


[Fig. 3A plot residue. Axes: spatial decay exponent, γ, versus distance from basement (m). Panel labels: "Abrupt, Below Basement" and "Steady, Above Basement".] 


Fig. 3. Above-basement injection commonly results in larger spatial 
footprints and a higher probability of inducing larger-magnitude earth- 
quakes. (A) The spatial decay, separated into abrupt (blue) and steady 
(orange), is controlled by the distance between injection and crystalline 
basement. (B) Maximum magnitude of each sequence as a function of total 


stress perturbation, and N_fl/(2πrΔr) is the density 
of faults per area. To build intuition for the influence 
of changes in Δσ, we first assumed that 
the fault density is constant and investigated 
the distance fall-off expected for the direct pressure 
effect and poroelastic coupling. For the 
direct pressure case, we can write the distance 
fall-off of the first term in Eq. 2 in a vertically 
confined reservoir as Δσ = ΔP_p ~ W(u), for which 
ΔP_p is pore pressure, W(u) is the exponential 
integral, and the argument u = r^2/(4Dt); r is distance, 
t is time from injection, and D is hydraulic 
diffusivity (26). The corresponding distance 
fall-off is substantially faster than poroelastic 
stress decay in the far field of injection, which 
decays as 1/r^2 (26). The expected distance decay 
based on pure pressure and poroelastic models 
approximately matches the shapes of observed 
density fall-offs for abrupt and steady decays 
(Fig. 1, B and C). 
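
To make the contrast between the two fall-offs concrete, the sketch below evaluates the exponential integral W(u) with u = r^2/(4Dt) and compares its decay with an inverse-power-law decay. The diffusivity (1 m^2/s), the elapsed time (10 days), the inverse-square exponent, and all function names are assumptions chosen for illustration only.

```python
import math

EULER_GAMMA = 0.5772156649015329

def well_function(u, max_iter=200, eps=1e-12):
    """Exponential integral E1(u), used here as the well function W(u), for u > 0."""
    if u <= 0.0:
        raise ValueError("u must be positive")
    if u <= 1.0:
        # power series: -gamma - ln(u) - sum_{k>=1} (-u)^k / (k * k!)
        total = -EULER_GAMMA - math.log(u)
        term = 1.0
        for k in range(1, max_iter):
            term *= -u / k  # term now equals (-u)^k / k!
            total -= term / k
            if abs(term / k) < eps * abs(total):
                break
        return total
    # continued fraction evaluated with the modified Lentz method, for u > 1
    b = u + 1.0
    c = 1.0e300
    d = 1.0 / b
    h = d
    for i in range(1, max_iter):
        a = -float(i * i)
        b += 2.0
        d = 1.0 / (a * d + b)
        c = b + a / c
        delta = c * d
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h * math.exp(-u)

def pressure_decay(r, diffusivity, t):
    """Relative pore-pressure perturbation at distance r (m) and time t (s)."""
    return well_function(r * r / (4.0 * diffusivity * t))

def power_law_decay(r, exponent=2.0):
    """Relative far-field stress perturbation, assumed to fall off as r^-exponent."""
    return r ** (-exponent)

D = 1.0           # hydraulic diffusivity in m^2/s (assumed)
t = 10 * 86400.0  # ten days of injection, in seconds (assumed)
for r in (1000.0, 2000.0, 4000.0):
    p = pressure_decay(r, D, t) / pressure_decay(1000.0, D, t)
    q = power_law_decay(r) / power_law_decay(1000.0)
    print(f"r = {r:6.0f} m  pressure ratio = {p:.4f}  power-law ratio = {q:.4f}")
```

Under these assumed values the pressure term drops far more steeply between 1 and 4 km than the power law does, which mirrors the qualitative contrast between abrupt and steady decay.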

Although the similarity between the simple 
models and the observed spatial decay is com- 
pelling, additional processes are expected to 
contribute. Such processes include the coupling 
between pressure and permeability, as well as 
heterogeneity in the permeability structure in 
the presence of faults (8, 14, 27). In addition, 
elastic stress transfer and event-event inter- 


31 August 2018 


[Fig. 2 plot residue. Example panel: Fenton Hill '83 with a linear migration fit; x axis: time from injection (days).] 


distance from injection; white markers show the 95th percentile of distances in 
specific time bins. (C) Histogram of the number of abrupt- and steady-decay 
sites with square-root (blue), linear (orange), or no (gray) migration. 


[Fig. 3B plot residue. x axis: total fluid volume (m^3). Labeled sites include St. Gallen, Basel, Paralana, Landau, Geysers, Habanero, Newberry, Fenton Hill, and KTB.] 


injected volume for steady (orange) and abrupt (blue) decays. The dashed 
line is the theoretically expected maximum moment based on (35), for which 
G is shear modulus and V is total injected volume. (For Fenton Hill, we report 
the largest recorded magnitude during all stimulations. No stimulation-specific 
fluid volume is available for Raft River.) 


action of both seismic and aseismic ruptures 
can increase the extent of induced seismicity 
sequences (12). 

We further analyzed sequences with extended, 
power law-like decay by fitting the joined densities 
above corner distances (r_c) of individual 
sequences (Fig. 1D). The corresponding linear 
least-squares fit of the log-transformed data has a 
value of γ = 1.8. This value is substantially smaller 
(i.e., more gradual) than the spatial decay of aftershocks 
from mainshock epicenters in California, 
for which γ = 2.4 (25) (Fig. 1D). The latter value 
was determined by identifying mainshock- 
aftershock clusters by using a nearest-neighbor 
Fig. 4. The probability of inducing an earthquake at distance r is controlled by fault availability 
and amplitude of stress perturbation. (A) Schematic representation of injection operation, 

footprint of poroelastic response (blue and red ellipses), and fault network (gray lines). (B) Earthquake 
probability (in events per area) as a function of distance from the injection well for pressure-dominated 
triggering. (C) Same as (B) in a coupled system with elastic stress dominance in the far field. [Both 


the x and y axes in (B) and (C) are logarithmic. ] 


distance in a space-time-magnitude domain (28). 
This difference in γ exponents indicates that 
spatial seismicity decay contains information 
on the distinctive forcing stresses specific to 
injection-induced sequences. The stresses from a 
mainshock that produce aftershocks cannot 
explain the data, so we must invoke additional 
processes such as fluid migration and poroelasticity. 
The smaller γ value for induced sequences 
points to a more gradual spatial relaxation of 
the underlying stress field. 
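
The log-transformed least-squares fit mentioned above can be sketched in a few lines: a straight line is fitted to log10(density) against log10(distance), and the decay exponent is the negative of the slope. The function name and the synthetic values are hypothetical; the exponent 1.8 is taken from the text purely to build the test data.

```python
import math

def fit_power_law(r, rho):
    """Least-squares fit of log10(rho) = log10(A) - gamma * log10(r)."""
    x = [math.log10(v) for v in r]
    y = [math.log10(v) for v in rho]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return -slope, 10.0 ** intercept  # (gamma, amplitude A)

# Synthetic, noise-free densities that follow rho = 5 * r^-1.8 (r in km).
r_km = [1.0, 2.0, 5.0, 10.0, 20.0]
rho = [5.0 * v ** -1.8 for v in r_km]
gamma_fit, amplitude_fit = fit_power_law(r_km, rho)
print(round(gamma_fit, 3), round(amplitude_fit, 3))
```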

To test the influence of location uncertainty 
on the shape and extent of spatial decay, we 
performed Monte Carlo simulations of random 
power law-distributed data convolved with ran- 
dom normal distributions with standard devia- 
tions that correspond to the observed location 
uncertainty (section S4). Our simulations reveal 
that decay exponents can in principle be inflated 
owing to location uncertainty. However, our loca- 
tion uncertainties are small enough that the 
effect is negligible for this study (fig. S9). We can 
robustly identify sites with abrupt decay in the 
2D data because both relative and absolute un- 
certainties are smaller than the determined r, 
values (fig. S10). 
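
The location-uncertainty test described above can be illustrated with a minimal sketch. It is not the procedure of section S4: the sample size, distance bounds, error size, and binning below are hypothetical choices, and the helper names are invented for illustration. Distances are drawn so that areal density decays as a power law, each location is perturbed by Gaussian noise, and the decay exponent is re-estimated from a log-log least-squares fit.

```python
import math
import random


def sample_distances(n, gamma, r_min, r_max, rng):
    """Draw n distances whose areal density decays as r**-gamma (gamma != 2)."""
    beta = gamma - 1.0  # in 2D the pdf of r itself goes as r**-beta
    a, b = r_min ** (1 - beta), r_max ** (1 - beta)
    # Inverse-CDF sampling of a truncated power law on [r_min, r_max].
    return [(a + rng.random() * (b - a)) ** (1 / (1 - beta)) for _ in range(n)]


def add_location_error(distances, sigma, rng):
    """Perturb each epicenter by isotropic Gaussian noise of std sigma."""
    return [math.hypot(r + rng.gauss(0, sigma), rng.gauss(0, sigma))
            for r in distances]


def fit_gamma(distances, r_lo, r_hi, n_bins=12):
    """Least-squares slope of log(density) versus log(distance)."""
    edges = [r_lo * (r_hi / r_lo) ** (i / n_bins) for i in range(n_bins + 1)]
    xs, ys = [], []
    for lo, hi in zip(edges[:-1], edges[1:]):
        count = sum(lo <= r < hi for r in distances)
        if count == 0:
            continue  # skip empty annuli, log(0) is undefined
        area = math.pi * (hi ** 2 - lo ** 2)
        xs.append(math.log(math.sqrt(lo * hi)))
        ys.append(math.log(count / area))
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return -slope


if __name__ == "__main__":
    rng = random.Random(0)
    clean = sample_distances(20000, 1.8, 0.1, 10.0, rng)  # distances in km
    noisy = add_location_error(clean, 0.02, rng)           # hypothetical 20 m error
    print(fit_gamma(clean, 0.2, 5.0), fit_gamma(noisy, 0.2, 5.0))
```

Increasing the hypothetical `sigma` toward the fitted distance range shows how a large location error can distort the recovered exponent, whereas a small one leaves it essentially unchanged.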

We next investigated potentially systematic 
migratory behavior of seismicity in each sequence. 
Migration is particularly interesting because the 
most commonly invoked mechanism for induced 
seismicity—direct pore-pressure diffusion from 
a nearly instantaneous injection—has a well- 
understood square-root dependence of distance 
on time (9). Other cases worth considering are 
no migration, as might be expected for a rapidly 
applied elastic stress perturbation, and linear 
migration. These three models were fitted to the 
seismicity envelope using a maximum-likelihood 
method, and the preferred model was chosen 
on the basis of a Bayesian information criterion 
(Fig. 2 and section S3). 
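
As a rough illustration of the model comparison described above, the sketch below fits the three migration shapes (none, square-root, linear) to a distance-time envelope and ranks them by a Bayesian information criterion. It assumes Gaussian residuals, so that maximum likelihood reduces to least squares, and one amplitude parameter per model; both are simplifying assumptions, and the function names are hypothetical rather than taken from section S3.

```python
import math

# Basis functions for the envelope distance r(t) = a * f(t).
MODELS = {
    "none": lambda t: 1.0,                  # no migration: constant distance
    "square_root": lambda t: math.sqrt(t),  # diffusion-like migration
    "linear": lambda t: t,                  # constant migration velocity
}


def bic_for_model(times, dists, basis):
    """BIC (up to an additive constant) for a one-amplitude model."""
    f = [basis(t) for t in times]
    # Closed-form least-squares amplitude for r = a * f(t).
    a = sum(r * fi for r, fi in zip(dists, f)) / sum(fi * fi for fi in f)
    rss = sum((r - a * fi) ** 2 for r, fi in zip(dists, f))
    n = len(times)
    k = 2  # free parameters: amplitude and residual variance
    return n * math.log(max(rss, 1e-300) / n) + k * math.log(n)


def preferred_model(times, dists):
    """Return the name of the lowest-BIC model and all scores."""
    scores = {name: bic_for_model(times, dists, b) for name, b in MODELS.items()}
    return min(scores, key=scores.get), scores


if __name__ == "__main__":
    # Synthetic envelope: square-root growth plus a small deterministic wiggle.
    times = [float(i) for i in range(1, 51)]
    dists = [0.3 * t ** 0.5 + 0.01 * ((i % 3) - 1) for i, t in enumerate(times)]
    print(preferred_model(times, dists)[0])
```

Because all three models carry the same number of parameters in this sketch, the BIC ranking coincides with the ranking by residual sum of squares; the penalty term matters only if models of different complexity are added.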

Square-root migration is most common at 
abruptly decaying sites, consistent with direct 
pore-pressure effects (Fig. 2). Cases with steady 
decay, on the other hand, show linear migration 
(five sites; Fig. 2C) or a lack of migration (six 
sites). The latter may be characteristic for poro- 
elastic stress-dominated sequences during which 
earthquakes are triggered beyond the pressure- 
dominated region even shortly after injection 
starts, leading to a breakdown of square-root 
migration (1). Processes that may contribute to 
linear migration are related to elastic stress 
transfer, including event-event triggering, slow 
slip, and aseismic slip (12, 18, 29). Additional 
mechanisms that likely affect seismicity migra- 
tion include fracture creation, changes in perme- 
ability structure, and thermally induced stresses. 
These mechanisms are expected to be most pro- 
nounced during geothermal injection and close 
to injection wells (16). 

Distance decay and space-time migration 
highlight two distinct groups of induced seis- 
micity sequences, consistent with direct pres- 
sure effects on the one hand and far-field elastic 
stresses on the other hand. To better under- 
stand the physical controls on this distinct be- 
havior, we investigated operational and geologic 
parameters at all injection sites. We evaluated 
the ability of injection volume, average flow rate, 
peak well head pressure, injectivity (which is 
approximated by flow rate divided by well head 
pressure), injection depth, distance to basement 
(i.e., vertical distance to the crystalline basement 
rock), reservoir temperature, stress state, and 
injection duration to discriminate the two pop- 
ulations defined on the basis of spatial decay 
(fig. S12) (24). The most important governing 
parameter for the separation of abrupt and steady 
decay is distance to basement: All sites with 
abrupt decay were injecting below the upper 
basement surface, and sites with steady decay 
(except for Basel) were located in sedimentary 
rocks above the basement (Fig. 3). 

This difference in behavior based on lithology 
at injection depth can potentially be explained 
by a difference in the ability of the rocks to 
transmit fluid stresses into the solid material. 
The Biot-Willis coefficient α (26) captures the 
poroelastic coupling so that larger α values 
correspond to more effective coupling 

$$\alpha = 1 - \frac{K}{K_s} \qquad (3)$$

where $K$ is the bulk modulus and $K_s$ is the 
modulus of the solid, so that α is expected to be 
close to 0 for stiff, low-porosity rocks in the base- 
ment and close to 1 for soft sediments (30). Ex- 
perimental results confirm that bulk strains in 
low-porosity rocks are less sensitive to fluid pres- 
sure changes (31). In other words, pressure per- 
turbations within the larger, interconnected pore 
space in sedimentary rocks are much more effi- 
cient in changing bulk stresses than pressure 
changes in smaller, isolated pores in crystalline 
rock. Sedimentary units may also have higher 
permeability, which influences the spatial extent 
of injection but is less consequential for the ob- 
served separation into steady and abrupt decay. 
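
To make the contrast concrete, the Biot-Willis relation of Eq. 3 can be evaluated for two illustrative rocks. The moduli below are hypothetical round numbers chosen only to show the two limits (a stiff rock whose bulk modulus approaches that of its solid grains, and a soft sediment); they are not values from this study.

```python
def biot_willis(k_bulk, k_solid):
    """Biot-Willis coefficient alpha = 1 - K / K_s (moduli in the same units)."""
    if k_solid <= 0 or not 0 <= k_bulk <= k_solid:
        raise ValueError("expected 0 <= K <= K_s and K_s > 0")
    return 1.0 - k_bulk / k_solid


if __name__ == "__main__":
    # Hypothetical moduli in GPa, for illustration only.
    print(biot_willis(45.0, 50.0))  # stiff, low-porosity rock: alpha near 0
    print(biot_willis(5.0, 37.0))   # soft sediment: alpha near 1
```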

A notable exception to the lithology-controlled 
seismic response is the Basel injection site. At 
this site, steady spatial decay of earthquakes up 
to local magnitude (ML) 3.4 was triggered by in- 
jection into the crystalline basement. A potential 
explanation for this observation is injection di- 
rectly into a highly fractured fault damage zone, 
as indicated by seismicity and focal mechanism 
analysis (13). Such damage zones likely exhibit 
higher α values, even within the crystalline base- 
ment, but will rarely be encountered unless spe- 
cifically targeted during injection because of their 
hydrogeologic properties. The 1967 Rocky Mountain 
Arsenal sequence is arguably in a similar category 
given the reported extent of the seismicity, but it 
is not included in this analysis because of the 
data quality issues discussed earlier. 

In addition to differences in local geology, 
we also observed variations in maximum event 
magnitudes between the two populations of in- 
duced earthquakes. Previous studies showed max- 
imum magnitudes above M4 for earthquakes 
induced by hydraulic fracturing and above M5 
for earthquakes induced by geothermal opera- 
tions or fluid disposal (1, 2, 15, 32). Peak mag- 
nitudes for a given injection volume are generally 
larger at sites with steady decay, which also 
produced the largest event in our dataset, with 
a magnitude of ML 4.7 (Fig. 3B). The difference 
in maximum magnitudes is expected if more 
extensive spatial footprints at sites with steady 
decay increase the probability of encounter- 
ing larger faults (33). Even for injection into a 
single well, poroelastic stresses may promote the 
activation of such distant, large faults that are 
close to failure without requiring a direct hydrau- 
lic connection. This effect is further enhanced by 
closely spaced, high-rate injection wells, which 
act as a finite source with stress decay out to ~2 
to 3 source dimensions (17). 

The spatial decay of induced sequences can 
be expressed by the product of stress perturba- 
tion and available prestressed faults that fail 
because of this perturbation (Fig. 4). At sites 
with high poroelastic coupling, we expect elastic 
stresses to dominate earthquake triggering in 
the far field of injection, showing a scale-invariant 
spatial decay of ~$r^{-2}$, whereas direct pressure 
effects dominate close to injection wells. If we 
assume that the term for fault availability in 
Eq. 2 is expressed by $N(r)/(2\pi r\,\Delta r) \propto r^{d_f - D}$, where 
$d_f$ is the fractal dimension of the fault network 
and $D$ is the geometric dimension of the density 
measurement (here, $D = 2$), we find that $d_f = 2.2$. 
Thus, fault density increases gradually as $r^{0.2}$ 
with distance, so that for pressure-dominated 
sequences, the expected point of peak seismicity 
occurs at some distance from the well (Fig. 4B). 
The model proposed here captures large-scale, 
average trends in seismicity decay with distance 
from injection wells. Nevertheless, additional pro- 
cesses such as coupling between changes in pore 
pressure and permeability, as well as variations 
in regional fault network geometry, may also 
contribute to the functional form of the decay. 
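
The exponent bookkeeping in this paragraph can be checked with a short sketch. It assumes that the far-field elastic stress decays as r^-2 and that the observed density exponent is the product of the stress and fault-availability terms, so that γ = p − (d_f − D); the function name and argument names are hypothetical.

```python
def fractal_dimension(gamma, stress_exponent, geometric_dim=2):
    """Solve gamma = stress_exponent - (d_f - D) for the fault dimension d_f."""
    return geometric_dim + stress_exponent - gamma


if __name__ == "__main__":
    d_f = fractal_dimension(gamma=1.8, stress_exponent=2.0)
    print(d_f)        # fractal dimension of the fault network
    print(d_f - 2)    # exponent of the fault-availability term r**(d_f - D)
```

With the steady-decay exponent γ = 1.8 quoted earlier, this reproduces the fault-availability exponent of about 0.2 used in the text.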

Our global compilation of fluid injection- 
induced seismicity allows for a better under- 
standing of the maximum earthquake-triggering 
distance from an injection site. We can de- 
scribe the shape of spatial seismicity decay as 
either abrupt or steady. Steady decay shows 
power law-like behavior to distances of more 
than 10 km and is more gradual than spatial 
aftershock decay. Sites with abrupt decay are 
dominated by square-root space-time migration, 
which is consistent with pressure diffusion. 
Abrupt decay is limited to sites where injection 
is within the crystalline basement, whereas steady 
decay primarily occurs above the basement. The 
maximum magnitude is larger for sites with 
steady decay owing to the greater probability 
of activating bigger faults within the extended 
spatial footprint of the injection wells. 

Previous strategies to mitigate induced seis- 
micity encouraged injecting in sedimentary units 
instead of directly into the basement (34). How- 
ever, our results suggest that injection into sed- 
imentary rocks leads to more distant and larger 
earthquakes for a given volume of injection, per- 
haps owing to more efficient pressure and stress 
transmission. The larger spatial footprints of 
above-basement injection may be responsible 
for the extensive seismogenic response in some 
areas, such as Alberta and Oklahoma (1, 4). 
Nevertheless, injection into the basement also 
poses a source of seismic hazard. Large earth- 
quakes, such as the 1967 Rocky Mountain Arsenal 
Mw 5.3 (8) or the recent South Korean Mw 5.4 
events (32), can occur if fluid is directly injected 
into a basement fault that is either specifically 
targeted or encountered by chance. The key result 
from this analysis is that injection in sedimentary 
units immediately overlying basement rocks 
is more likely to encounter a large fault by 
chance because of the larger spatial footprint. 
Far-reaching poroelastic effects complicate the 
assessment of distance-based induced seismic 
hazard and should be included in mitigation 
strategies. 

SOLAR CELLS 

High-performance perovskite/Cu(In,Ga)Se₂ monolithic tandem solar cells 

En-Ping Yao, Sheng-Yung Chang, Sang-Hoon Bae, Takuya Kato, 

The combination of hybrid perovskite and Cu(In,Ga)Se₂ (CIGS) has the potential for 
realizing high-efficiency thin-film tandem solar cells because of the complementary 
tunable bandgaps and excellent photovoltaic properties of these materials. In tandem 
solar device architectures, the interconnecting layer plays a critical role in determining 
the overall cell performance, requiring both an effective electrical connection and 
high optical transparency. We used nanoscale interface engineering of the CIGS surface 
and a heavily doped poly[bis(4-phenyl)(2,4,6-trimethylphenyl)amine] (PTAA) hole 
transport layer between the subcells that preserves open-circuit voltage and enhances 
both the fill factor and short-circuit current. A monolithic perovskite/CIGS tandem 
solar cell achieved a 22.43% efficiency, and unencapsulated devices under ambient 
conditions maintained 88% of their initial efficiency after 500 hours of aging 
under continuous 1-sun illumination. 


Constructing a tandem solar cell with min- 
imal thermalization losses has proved 
to be a successful approach for overcom- 
ing the Shockley-Queisser limit of a single- 
junction cell. This device can realize the 
superposition of the open-circuit voltage (Voc) 
of both subcells while simultaneously preserv- 
ing high short-circuit current (Jsc) by using 
photoactive materials with complementary ab- 
sorption characteristics to harvest a broader 
solar spectrum (1-9). Thin-film photovoltaic 
(PV) technologies applying various inorganic 
and organic photoactive materials have attracted 
considerable attention (10-13). Perovskite com- 
pounds debuted as an efficient light harvester 
for photoelectrochemical cells (14) and later 
evolved from liquid to solid-state junctions that 
enabled a large boost in performance (15, 16), 
reaching power conversion efficiencies (PCEs) 
>22% in just 5 years (17-21). Meanwhile, the well- 
established Cu(In,Ga)Se₂ (CIGS) solar cells have 
also yielded a maximum PCE >22% (22, 23). 
Both PV materials have widely tunable band- 
gaps, from 1.0 to 1.7 eV for CIGS and 1.2 to 2.3 eV 
for perovskite (24-29). These characteristics 
provide the capability to achieve the highest ef- 
ficiency of double-junction tandem solar cells, 
where the ideal rear and front cells should 
have bandgaps of 1.1 and 1.7 eV, respectively 
(30, 31). 
Several studies on four-terminal mechanically 
stacked perovskite/CIGS tandem solar cells have 
been reported (32-34). The highest efficiency 
for this type of perovskite/CIGS architecture 
is 22.1%, obtained with a 19% PCE CIGS rear 
cell and a 16% PCE perovskite front cell (35). By 
contrast, few studies on two-terminal perovskite/ 
CIGS tandem solar cells have been reported, even 
though a two-terminal monolithic tandem archi- 
tecture is potentially preferable for industrial 
applications because of the reduced number of 
electrodes and transparent conducting layers 
necessary. In 2015, Todorov et al. reported a two- 
terminal perovskite/CIGS tandem solar cell with 
an efficiency of 10.9%, which was much lower 
than the performance of the individual CIGS or 
perovskite subcells (36). There are three main 
reasons for this inferior efficiency. First, optical 
losses can be caused by top opaque metal elec- 
trodes. Second, the intrinsic ZnO (i-ZnO) and 
aluminum-doped ZnO (AZO) layers of typical 
CIGS cells were removed as zinc oxides can 
cause deterioration of the perovskite layer. How- 
ever, by doing so, the original CIGS device 
architecture was compromised and elimination 
of the ZnO layer would inevitably degrade the 
CIGS device performance. Third, the fill factor 
(FF) was reduced to 60% because of a high series 
resistance (Rs) caused by poor contact between 
the two subcells. 

The smoothness of the interconnecting layer 
(ICL) is equally crucial to create a reliable con- 
tact between the two subcells, because the planar 
perovskite solar cell is composed of several func- 
tional layers, with thicknesses from a few tens to 
hundreds of nanometers, that are sensitive to 
substrate roughness. Thus, in order to take advan- 
tage of these two technologies for two-terminal 
tandem solar cells, the challenge is to ensure 
the integrality of the two subcells, which relies 
heavily on the transparent top electrode of the 
perovskite front cell, maintaining the integrity 
of the original CIGS device structure to preserve 
its superior efficiency, and a well-designed ICL 
with a smooth surface. 

We developed a transparent top electrode, suit- 
able ICL, and hole-transporting layer (HTL) for our 
tandem device and present a high-performance 
monolithic perovskite/CIGS tandem solar cell 
without modification of the CIGS device struc- 
ture, i.e., preserving its TCO layers (i-ZnO and 
boron-doped ZnO (BZO) layers). For the two sub- 
cells, we applied a semitransparent perovskite 
with a bandgap of 1.59 eV as the front cell, and 
CIGS with a bandgap of 1.00 eV as the rear cell. 
The certified tandem device achieves a PCE of 
22.43%. 

To design a functional ICL, the CIGS device 
surface must be taken carefully into considera- 
tion. In this study, BZO is used as the top layer of 
the CIGS device, which has a surface roughness 
of about 60 nm, and the maximum vertical dis- 
tance (VD) of the natural BZO layer texture can 
reach more than 250 nm (Fig. 1A). We speculate 
that such considerable roughness and VD may 
originate from the difference between peaks and 
valleys of the CIGS absorber layer. In addition, 
inhomogeneous nucleation of the bottom CdS 
buffer layer can also enhance BZO roughness, as 
shown in fig. S1. 

The maximum VD is comparable to the thickness 
of the perovskite absorber layer, which is usu- 
ally between 300 and 600 nm, and even larger 
than the thickness of the perovskite charge- 
transporting layers. With these large VDs, it 
becomes challenging to stack the perovskite 
solar cell on top of the CIGS with a homogeneous 
layer-by-layer structure. The rough BZO surface 
would cause perovskite subcell failure as the BZO 
peaks and rods can easily entangle the functional 
layers in the perovskite device to induce electri- 
cal shorting pathways between the top contact of 
perovskite subcell and the BZO layer. We con- 
firmed that the nature of the CIGS device surface 
is problematic for building a smooth ICL on top 
of it, and hence the ICL roughness is pivotal in 
realizing high-performance perovskite/CIGS tan- 
dem solar cells. 

To address this issue, we first deposited an 
indium tin oxide (ITO) layer, followed by chem- 
ical mechanical polishing (CMP) to smooth out 
the ITO surface. The addition of a sufficiently 
thick ITO layer can serve as a buffer layer for 
the CMP process to level out the huge VD of the 
BZO layer. The ITO layer was polished with a 
commercial SiO₂ slurry. The detailed param- 
eters for CMP processing are provided in the sup- 
plementary materials (37, 38). After polishing, the 
maximum VD of the ITO layers was reduced to 
40 nm (Fig. 1B), which rendered the ITO surface 
smooth enough for subsequent fabrication of 
the functional perovskite front cell. Notably, the 
CMP process did not polish the BZO layer such 
that we retained the original CIGS solar cell 
structure (Fig. 1C). Furthermore, the BZO work 
function (−4.0 eV, fig. S2) was lower than that of 
poly[bis(4-phenyl)(2,4,6-trimethylphenyl)amine] 
(PTAA) (−5.1 eV, the HTL of perovskite subcell), 
which causes a large contact potential barrier. 
This ITO layer can efficiently modify the surface 
work function to create a better ohmic contact 
for hole transportation. On the basis of our ex- 
periments, a 300-nm ITO layer proved sufficient 
to carry out the CMP process and fully cap the 
BZO peaks and rods. 

Fig. 1. Effects of CMP on CIGS surface and resulting performance of 
CIGS solar cells. (A) Atomic force microscopy (AFM) image of the CIGS 
surface before CMP polishing. (B) AFM image of the CIGS surface after 
CMP polishing. (C) Cross-sectional SEM images of the CMP processing on 
the CIGS surface. (D) J-V curves of original CIGS solar cells and after CMP 
polishing with a step size of 0.02 V and a scan velocity of 0.1 V/s, 
measured under AM1.5G illumination. (E) EQE of original CIGS solar cells 
and after CMP polishing. 

Fig. 2. Performance of the semitransparent Cs0.09FA0.77MA0.14Pb(I0.86Br0.14)3 
perovskite solar cell. (A) Schematic of the semitransparent single junction 
perovskite solar cell. (B) Transmittance spectrum through the entire device 
stack. (C) J-V curves of the perovskite solar cell using different thicknesses 
of 1 wt % F4-TCNQ-doped PTAA with illumination through the MgF₂ side. 
(D) J-V curves of the perovskite solar cell using different thicknesses of 10 wt % 
TPFB-doped PTAA with illumination through the MgF₂ side. (E) J-V curves 
in the forward (−0.1 to 1.2 V) and reverse (1.2 to −0.1 V) scan of the perovskite 
solar cell using 10 wt % TPFB-doped PTAA with illumination through the 
MgF₂ side. (F) EQE spectrum of the perovskite solar cell using 10 wt % 
TPFB-doped PTAA. The Jsc calculated from the EQE curve is 18.062 mA/cm². 
(G) Photoluminescence of the perovskite layer on top of glass and PTAA doped 
with 1 wt % F4-TCNQ or 10 wt % TPFB. (H) TRPL data for the perovskite layer in 
contact with glass and PTAA doped with 1 wt % F4-TCNQ or 10 wt % TPFB. 

Fig. 3. Performance of the perovskite/CIGS tandem cells. 
(A) Schematic and cross-sectional SEM image of the monolithic 
perovskite/CIGS tandem solar cell. (B) J-V curve (NREL-certified; see 
fig. S8) and efficiency at the maximum power point (inset) of the champion 
tandem device. (C) EQE spectra for the subcells of the monolithic 
perovskite/CIGS tandem solar cell. (D) Stability test of the monolithic 
perovskite/CIGS tandem solar cell. The unencapsulated device maintained 
88% of its initial PCE after 500 hours of aging under continuous 
1-sun illumination and maximum power point tracking at 30°C ambient 
environment. The inset shows that the device can recover 93% of its initial 
performance after a 12-hour resting period without load and illumination. 

The current density-voltage (J-V) curves of the 
stand-alone original and polished CIGS solar cells 
are compared in Fig. 1D, and the data are shown 
in Table 1. The Voc remained constant, which fur- 
ther implies that CMP processing did not damage 
the CIGS device structure. The Jsc decreased from 
37.10 to 34.34 mA/cm² for the ITO-polished de- 
vice. We attributed the Jsc drop to the additional 
light absorption by the ITO layer, as indicated 
by the lower external quantum efficiency 
(EQE) across the entire response region 
(Fig. 1E). The EQE of the ITO-polished device was 
lower than that of the original CIGS device at 
wavelengths from 400 to 500 nm. This response 
region corresponds to the smaller bandgap of 
ITO compared to BZO and provides evidence 
that the ITO layer absorbed a fraction of the 
incident light. However, the ITO absorption in 
the short-wavelength region is inconsequential because 
the CIGS subcell is designed as a rear cell in the 
tandem device structure. After polishing, the FF 
decreased from 74.7 to 72.4%. The FF reduction 
was mainly caused by the Rs increase arising 
from mediocre ITO lateral conductivity rather than 
a shunt resistance decrease (Fig. 1D). However, 
because this ITO layer is used as the ICL for the 
tandem solar cell, the lateral conductivity will 
not affect the charge carrier transportation be- 
tween the front and rear subcells. After polishing 
the ITO layer of the CIGS device, the PCE was re- 
duced from 18.73 to 16.76%. If we exclude the FF 
deficit, the current loss was only 1.386 mA/cm² 
in the wavelength region from 750 to 1250 nm. 

We fabricated a semitransparent perovskite 
solar cell with an inverted structure (i.e., p-i-n) 
(Fig. 2A). Instead of using a metal electrode, we 
used a 100-nm ITO layer as the top contact in 
this structure to allow for sufficient light trans- 
mission (39). Various bandgaps of perovskites 
were tried in order to achieve current matching 
between the two subcells in the monolithic cell, 
and the best performance was achieved with a 
composition of Cs0.09FA0.77MA0.14Pb(I0.86Br0.14)3, 
which had a bandgap of 1.59 eV from ultraviolet- 
visible (UV-vis) measurements (fig. S4). Accord- 
ing to the optical simulation results (fig. S6), a 
600-nm perovskite layer is needed to provide 
adequate current density to match the CIGS 
rear-cell current density. The average transmit- 
tance of the semitransparent perovskite cell 
in the wavelength region between 770 and 
1300 nm is >80% (Fig. 2B), allowing most of 
the long-wavelength light to be absorbed by 
the CIGS rear cell. The transmittance gradually 
decreased from 770 to 550 nm, and the light 
was fully absorbed by the perovskite cell below 
550 nm. 


The thickness and coverage of the first layer 
on top of the ICL play a critical role for tandem 
device performance because the planar perov- 
skite device structure has a limited tolerance to 
the substrate roughness. Given that the polished 
ICL still had a maximum VD of about 40 nm, we 
studied the solar cell performance versus the 
HTL thickness by using PTAA as the HTL and two 
different molecules, 2,3,5,6-tetrafluoro-7,7,8,8- 
tetracyanoquinodimethane (F4-TCNQ) and 
4-isopropyl-4'-methyldiphenyliodonium tetrakis 
(pentafluorophenyl)borate (TPFB), as dopants 
to enhance the HTL conductivity. We deposited 
PTAA with a low-temperature annealing process 
(110°C) to avoid damage to the rear CIGS solar 
cell (40-42). The J-V curves of semitransparent 
devices using F4-TCNQ and TPFB with differ- 
ent thicknesses measured under 100-mW/cm² 
illumination are shown in Fig. 2, C and D, re- 
spectively, and the corresponding device param- 
eters are summarized in Table 2. 

With the same concentration of F4-TCNQ, the 
device performance obtained by applying 30 nm 
PTAA is similar to that achieved with 20 nm 
PTAA; however, the Rs increased when the PTAA 
reached 40 nm, which led to a FF and PCE reduc- 
tion. The best device performance was achieved by 
using 20 nm F4-TCNQ-doped PTAA that gave a 
Voc of 1.084 V, Jsc of 18.10 mA/cm², and FF of 75.6%, 
leading to an overall device efficiency of 14.83%. The 
optimal F4-TCNQ/PTAA ratio was 1 weight % (wt %), 
and increasing the F4-TCNQ concentration did not 
decrease the Rs because F4-TCNQ aggregates 
within the PTAA film. The J-V curves of the de- 
vices with different F4-TCNQ doping levels in the 
30-nm PTAA layers are presented in fig. S7, and 
their device data are summarized in table S1. 

Table 1. Performance of CIGS solar cells before and after CMP polishing. 
Columns: Device configuration; Voc (V); Jsc (mA/cm²); FF (%); PCE (%). 

Table 2. Performance of perovskite solar cells using different dopants. 
Columns: Device configuration; HTL thickness (nm). 
Row groups: Perovskite solar cell using F4-TCNQ-doped PTAA as HTL; 
Perovskite solar cell using TPFB-doped PTAA as HTL, including 50 nm (forward scan). 

The J-V curves of the perovskite device using 
TPFB-doped PTAA (TPFB/PTAA = 10 wt %) were 
less sensitive to HTL thickness. The best device 
had 50-nm PTAA, with a Voc of 1.091 V, Jsc of 
18.15 mA/cm², FF of 75.5%, and PCE of 14.95%. 
The high FF of this device resulted from the lower 
Rs of the HTL, because the TPFB-doped PTAA has 
a higher conductivity than the F4-TCNQ-doped 
PTAA. 

The semitransparent perovskite device using 
TPFB-doped PTAA was scanned from positive to 
negative (reverse scan) and negative to positive 
(forward scan) voltages with a step size of 20 mV 
and a delay time of 0.2 s for each data point in 
the J-V measurement (Fig. 2E). The photocurrent 
hysteresis is negligible as the perovskite grain 
boundaries are well passivated by [6,6]-phenyl- 
C61-butyric acid methyl ester (PCBM), which 
agrees with other reported results (43, 44). EQE 
data (Fig. 2F) for the semitransparent perovskite 
cell show an offset position at 780 nm, which was 
consistent with the UV-vis results. The integrated 
Jsc from the EQE using the AM 1.5 reference 
spectrum reached 18.062 mA/cm². 

Device performance is closely related to charge 
carrier dynamics in perovskite solar cells. We 
analyzed the charge collection and transporta- 
tion by using steady-state photoluminescence 
(PL) and time-resolved PL (TRPL). Figure 2G 
shows the steady-state PL spectroscopy of pe- 
rovskite films on three different substrates (glass, 
ITO/F4-TCNQ-doped PTAA, and TPFB-doped 
PTAA). Quenching was observed on both types 
of PTAA compared to the perovskite layers on 
glass, indicative of efficient charge transfer 
from the photoactive layer to the transport 
layer on contact with these two types of PTAA. 
From the TRPL responses, a decrease in the 
PL lifetime from 335 ns to 84 and 78 ns in the 
presence of 30-nm F4-TCNQ and 50-nm TPFB- 
doped PTAA, respectively, indicated that charge 
carriers within the perovskite layer were 
extracted effectively by these two types of 
PTAA. 

Table 2 (continued). 
Voc (V)   Jsc (mA/cm²)   FF (%)   PCE (%) 
          18.10          75.6     14.83 
          18.12          74.5     14.66 
1.092     18.14          75.3     14.92 

Figure 3A illustrates the schematics and cross- 
sectional scanning electron micrograph (SEM) 
images of the tandem devices studied here. The 
polished ITO layer was used as the ICL to bridge 
two subcells together without the need for a tun- 
neling junction. The J-V curve for the perovskite/ 
CIGS champion tandem solar cell with an area of 
0.042 cm² is shown in Fig. 3B, certified by the National 
Renewable Energy Laboratory (NREL). The 
tandem cell exhibited a Voc of 1.774 V (equal to 
the sum of the stand-alone Voc of subcells), a FF 
of up to 73.1%, and a Jsc of 17.3 mA/cm², 
leading to an overall device PCE of 22.43%. We 
observed negligible hysteresis (fig. S9). In Fig. 
3C, the integrated Jsc from the EQE curves for 
the top and rear cells is 18.20 and 17.76 mA/cm², 
respectively, showing that the subcells are well 
current-matched, although the rear solar cell slightly 
limited the overall tandem solar cell current. 
The EQE of the ITO-polished CIGS is lower 
than 80% from 800 to 1100 nm in Fig. 1E, but it 
can be restored to ~85% by applying a MgF₂ 
layer (Fig. 3C). This improvement can minimize 
the efficiency loss of the polished CIGS. Tandem 
devices with larger area (0.52 cm²) were also 
made, and the best one with negligible hysteresis 
had a 20.8% PCE measured in-house (fig. S11). 
The device parameters are summarized in table 
S2. To realize the full potential of perovskite/ 
CIGS tandem devices, we suggest three key 
improvements. First, reducing the Voc loss in 
the perovskite subcell is required. An effective 
defect passivation can help to provide a higher 
Voc (45, 46). Second, using vapor-based pro- 
cesses to deposit the electron transport layer is 
more desirable than conventional solution pro- 
cessing to prevent shunt pathways and improve 
interface contacts (39). Lastly, a CIGS subcell 
with a higher PCE is needed, especially 
considering that the limiting current could be in- 
creased by tailoring the CIGS bandgap in our case. 
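
The reported efficiencies follow from the standard relation PCE = Voc × Jsc × FF / P_in. The short sketch below checks the quoted figures, assuming 1-sun illumination of 100 mW/cm² (the value stated for the J-V measurements); the function name is hypothetical.

```python
def pce_percent(voc_v, jsc_ma_cm2, ff_percent, p_in_mw_cm2=100.0):
    """Power conversion efficiency in percent.

    Voc [V] times Jsc [mA/cm^2] gives mW/cm^2, which is scaled by the
    fill factor and divided by the incident power density.
    """
    p_out = voc_v * jsc_ma_cm2 * (ff_percent / 100.0)
    return 100.0 * p_out / p_in_mw_cm2


if __name__ == "__main__":
    print(round(pce_percent(1.774, 17.3, 73.1), 2))   # certified tandem cell
    print(round(pce_percent(1.084, 18.10, 75.6), 2))  # F4-TCNQ-doped PTAA cell
    print(round(pce_percent(1.091, 18.15, 75.5), 2))  # TPFB-doped PTAA cell
    # In a two-terminal tandem the smaller subcell current sets the limit.
    print(min(18.20, 17.76))                          # EQE-integrated Jsc, mA/cm^2
```

The last line illustrates why the rear CIGS cell is described as current-limiting: in a series-connected stack the tandem current cannot exceed the smaller of the two subcell currents.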

In addition to a high PCE, long-term stability is 
another crucial benchmark for industrialization 
of perovskite solar cells. Several aging routines 
have been suggested to estimate a conclusive 
stability in which the ion migration effects are 
excluded (47, 48). We monitored the unencap- 
sulated tandem device performance by aging for 
500 hours under continuous 1-sun illumination 
and maximum power point tracking at 30°C am- 
bient environment. The device started with 22.0% 
PCE and retained >88% of its initial efficiency 
after aging, and it recovered 93% of its initial PCE 
after being kept in the dark for 12 hours without 
load (shown in Fig. 3D). We believe that the top 
transparent metal oxide layers (composed of 
ZnO nanoparticles and sputtered ITO) can ef- 
fectively resist moisture ingress (39, 49), so that 
this structure can help the perovskite compounds 
remain stable without severe degradation. 

Our approach can also help to alleviate the 
environmental impact of cadmium and extend 
the working period of the perovskite/CIGS tandem 
solar cell. The CIGS solar cells can be reused after 
washing out the degraded perovskite front cell. 
Because here we apply PTAA as the HTL ma- 
terial, the whole perovskite front cell can be 
removed from the CIGS rear cell by dissolving in 
chlorobenzene and N,N-dimethylformamide. 
Details of the washing process are described 
in the supplementary materials. The CIGS rear 
cell maintains the same performance when the 
front perovskite cell is removed, demonstrating 
that the fabrication and dissolving processes of 
the front subcell do not damage the CIGS de- 
vice. Similar PCEs are obtained for the reused 
tandem devices (fig. S12). 

MOLECULAR MOTORS 

Directional control of a processive molecular hopper 

Yujia Qing, Sandra A. Ionescu, Gökçe Su Pulcu, Hagan Bayley* 


Intrigued by the potential of nanoscale machines, scientists have long attempted to control 
molecular motion. We monitored the individual 0.7-nanometer steps of a single molecular 
hopper as it moved in an electric field along a track in a nanopore controlled by a chemical 
ratchet. The hopper demonstrated characteristics desired in a moving molecule: defined 
start and end points, processivity, no chemical fuel requirement, directional motion, 
and external control. The hopper was readily functionalized to carry cargos. For example, a 
DNA molecule could be ratcheted along the track in either direction, a prerequisite for 
nanopore sequencing. 


Processivity lies at the heart of biological machines. A replicative DNA polymerase can incorporate thousands of nucleotides before dissociating from its template (1). Molecular motors, such as kinesin and dynein, travel directionally along microtubules over hundreds of steps without detaching from the track (2-4). For years, scientists have been trying to build moving molecules that resemble their biomolecular counterparts but use simpler components (5). The ultimate goals are to achieve true processivity, which can be defined as directional motion without leaving a track, and the performance of useful work such as the transport of a cargo. Ideally, a synthetic system should exhibit the reversibility of stepping seen in various biological systems (6, 7) to enable the direction of motion to be switched through external control.

We report the design of a one-legged molecular hopper that is ratcheted by dynamic covalent chemistry along a protein track (Fig. 1A) with robust processivity (table S1). Further, the direction in which the hopper moves is subject to external control by an electrical potential. The track is built inside a protein nanopore, α-hemolysin (αHL), and consists of a series of cysteine footholds facing the lumen of the transmembrane β barrel (Fig. 1B). The cysteines are evenly spaced along a β strand with an average interfoothold distance of 6.8 Å (Cα-Cα) and an average vertical spacing of 5.6 Å. The hopper uses consecutive thiol-disulfide interchange reactions to move in the direction in which the DNA cargo has been oriented by an applied potential. To execute the SN2 reaction, the three participating sulfur atoms must align in a near-linear configuration (8-10). Under the applied potential, the DNA inside the barrel is pulled in the electric field with a force of ~10 pN (see supplementary materials). The force sets the overall direction of motion by flipping the DNA (see below) and helps to orient the disulfide for cleavage by the neighboring downstream cysteine thiolate, which moves the hopper one step forward, although other forces can contribute to the forward motion (Fig. 1C). Backstepping is disfavored and overstepping is impossible. Release of the hopper from the linear track was not observed, presumably because the track is too rigid to accommodate the resulting disulfide bridge between adjacent footholds on the same β strand. In short, each step is chemically directional because the hopper’s “foot” is positioned to favor the forward reaction. Further, the motion is autonomous, requiring no chemical fuel.

Under +150 mV, the hopper was delivered to the track from the cis compartment as a hopper-carrier conjugate (Fig. 1D) capped with a single traptavidin, which was arrested at the pore entrance (Fig. 2A). The disulfide in the construct reacted strictly regioselectively with Cys115, releasing the carrier and placing the hopper-DNA cargo on the starting foothold (Fig. 2A and fig. S1). The location of the hopper was ascertained from the residual current passing through the nanopore, which reflected the length of the DNA located within the β barrel when the hopper was at a particular foothold (fig. S1). By monitoring current changes, we followed the stepwise hopping motion at the single-molecule level in real time.

The voltage-controlled hopping motion was 
directional and processive. On a track contain- 
ing five cysteine footholds (at positions 113, 
115, 117, 119, 121), a hopper carrying an oligo- 
adenosine 40-mer (A40, hopper 1) moved cis 
Fig. 1. A molecular hopper on a protein track. (A) A hopper carrying a cargo (red flag) moves along a track by means of consecutive thiol-disulfide interchange reactions. The overall direction is set by the applied potential. (B) A six-foothold track comprising odd-numbered cysteine residues on a β strand inside the αHL protein nanopore. (C) The applied potential exerts a force on the DNA cargo, which helps to align the three sulfur atoms (yellow) participating in the interchange. The collinear geometry promotes hopping (1) (movie S1) but not the formation of an intrastrand disulfide, which would release the hopper from the track (2). Occasionally, backstepping is observed (3). Overstepping (4) does not occur. (D) The hopper enters the nanopore as a carrier-hopper disulfide conjugate.
Fig. 2. Monitoring individual hopper steps. (A) Under +150 mV, a hopper-carrier conjugate capped with traptavidin was pulled from the cis compartment into an αHL nanopore containing cysteines at positions 113, 115, 117, 119, and 121 in one of the seven subunits. The resultant blockade reduced the ionic current from (i) to (ii). Reaction of the disulfide in the hopper-carrier with Cys115 covalently attached the hopper to the track, and the ionic current increased to (iii). (B) With a five-cysteine track, four hopping steps were observed at +150 mV. Every forward step moved part of the DNA cargo outside the β barrel, producing an increase in conductance (movie S2). Alternation of the applied potential drove the hopper repeatedly up and down the track. (C) A hypothetical free energy diagram (not to scale) of the controlled hopping motion. (D) On an L-shaped track consisting of cysteines at positions 115, 117, 119, and 139, the hopper moved along the track from Cys115 to Cys119, where it was released by the side chain of Cys139. Subsequently, a second hopper became loaded at Cys115, but its motion was arrested at Cys117 because Cys119 was now engaged in an interstrand disulfide bond [(i), (ii), (iii) as in (A)]. Backstepping to Cys115 was also seen. Conditions: 2 M KCl, 20 mM HEPBS, 20 µM EDTA, pH 8.5, 20° ± 1°C.
Fig. 3. Discrimination of different DNA cargos. abasic nucleotides (dSdS) were substituted at positions 3 and 4 (hopper 2) or at positions 2 and 3 (hopper 3). The numbers of nucleotides (brown circles, dA; red circles, dS) placed inside the β barrel are based on PyMOL modeling. (B) With a five-cysteine track, four-step hopping was observed with hopper 2 at +150 mV. The current decreases for hops from 115 to 117 and from 121 to 119 are marked (green arrows). (C)


to trans under +150 mV, and trans to cis under −150 mV (Fig. 2B and fig. S2). When the hopper reached a terminal foothold, the sign of the applied potential was reversed in order to reorient (flip) the DNA cargo, and hence the hopper. Alternation between positive and negative potentials repeatedly drove the hopper toward the trans or the cis end of the track. For the DNA to experience a force, at least one negatively charged phosphodiester bond must lie within the electric field, which drops along the length of the pore’s β barrel (11). Therefore, in the present nanopore construct, the length of the track was limited and voltage-controlled hopping could only be demonstrated with up to six footholds (fig. S3). The hopping direction could be changed by reversing the applied potential at any foothold; thus, our system offers complete control over directionality and the ability to move a hopper back to the initial foothold after an outing. Moreover, the applied potential provides an external energy source to produce directional motion (Fig. 2C; see supplementary materials). Limited by bilayer stability, the longest records of processive hopping were documented with hopper 1, which completed 249 forward steps in 93 min on a five-cysteine track (113 to 121) with a mean dwell time of ~22 s per foothold (fig. S2). Dissociation of the hopper from the track was never seen (n > 30 outings on different tracks), which implies that substantial improvements in step numbers could be achieved if the stability of the bilayer were improved. In comparison, previous synthetic small-molecule walkers moved directionally for fewer than 10 steps (5, 12). Wild-type kinesins typically exhibit a mean step number of 75 to 175 before dissociation (2, 3).
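
As a quick consistency check on the figures quoted above, the mean dwell time follows from dividing the recording length by the number of forward steps. The short Python sketch below does this arithmetic. The variable names are illustrative and not taken from the paper, and the sketch assumes that the ~22 s value was averaged over the forward steps alone.

```python
# Values quoted in the text for hopper 1 on the five-cysteine track.
forward_steps = 249   # forward steps completed during the recording
recording_min = 93    # length of the recording in minutes

# Assumption: the dwell time is averaged over forward steps only.
mean_dwell_s = recording_min * 60 / forward_steps

print(f"mean dwell time: {mean_dwell_s:.1f} s per foothold")
```

Under that assumption the result agrees with the quoted value of roughly 22 s per foothold.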

The hopping rates for each of the four steps 
on the five-cysteine track were derived for both 
the cis-to-trans and trans-to-cis directions at 


31 August 2018 




hopper 3 at ±150 mV. The current transitions for hops from 117 to 119 and from 119 to 117 are marked (red arrows). (D) Top: Overlaid current traces of hoppers 1, 2, and 3 [colors as in (A)] with step durations normalized. The current levels are given as the residual current with respect to the open pore level (Ires%). Bottom: Step sizes of hoppers 2 and 3 plotted as ΔIres%. Minima in the plots showing the single nucleotide offset are marked. Conditions: 2 M KCl, 20 mM HEPBS, 20 µM EDTA, pH 8.5, 20° ± 1°C.


pH 8.5 and displayed differences of less than a factor of 40 (0.0081 to 0.30 s⁻¹; tables S2 to S4). Because a thiolate is the reactive nucleophile in disulfide interchanges, the rate differences might arise from variations in the pKa values of the foothold thiols, which will be affected by neighboring residues. Previously, an arsenic(III) walker showed a factor of <50 difference in attachment rates with the footholds on the same five-cysteine track (113 to 121) at pH 8.0 (12). In the future, tracks of thiols might be engineered with optimized interfoothold distances and enhanced chemical reactivity to speed up the hopping process. Alternatively, the properties of the reactive sulfur atom in the hopper might be manipulated by flanking functional groups. With both the two-cysteine and three-cysteine tracks, the influence of voltage on hopping was examined at +100 mV, +150 mV, and +180 mV (tables S5 and S6). The rates showed weak nonexponential voltage dependences, suggesting that the applied potential might not be the only source of propulsion (see supplementary materials). However, an electrical potential is essential (i) to flip the DNA over a large barrier to set the direction of motion, and (ii) to aid in the orientation of the three participating sulfur atoms to favor “forward” reactions over backsteps. With respect to the latter, the estimated “effective concentrations” of participating downstream thiols are not especially high (see supplementary materials) (13) and indeed need not be high to produce overall forward motion (see below).

Although disfavored, backstepping was occasionally detected, which we attributed to conformational lability of the hopper within the nanopore even under an applied potential. During a recording with hopper 1 on a five-cysteine track (113 to 121), there were 33 backward steps on the nonterminating footholds out of 282 steps in total (12%). Of the 33 backward steps, 29 occurred from 115 to 117 at −150 mV (table S3). Despite the large forward equilibrium constant for 117-to-115 stepping [K = k(117→115)/k(115→117) = 22], 115-to-117 backstepping was observed because of the comparatively slow forward movement to the next foothold, 113 [at −150 mV, k(115→117) = 0.0094 s⁻¹, k(115→113) = 0.0081 s⁻¹; tables S2 and S3]. Backstepping was observed when the hopper was left on a terminal foothold, as no forward footholds remained (Fig. 2B). These backsteps were quickly reversed by the hopper, which preferentially resided at the final station [K = k(119→121)/k(121→119) = 5.2; table S4]. The overall motion of the hopper is governed by the product of the K values for each step. A modest value (K > 1) at each step produces a considerable overall tendency toward forward movement.
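
The numbers in this paragraph can be checked with a few lines of Python. The sketch below is illustrative only: it treats the two exits from foothold 115 as competing first-order processes (an assumption, not a statement from the paper), and the four-step example with K = 2 per step is hypothetical. All function and variable names are invented for the example.

```python
from math import prod

def branching_probability(k_this, k_other):
    """Chance that a first-order process with rate k_this fires before k_other."""
    return k_this / (k_this + k_other)

# Counts quoted in the text for hopper 1 on the five-cysteine track.
backward_steps = 33
total_steps = 282
backstep_fraction = backward_steps / total_steps

# Rate constants quoted at -150 mV for leaving foothold 115 (units: 1/s).
k_115_to_117 = 0.0094  # backstep
k_115_to_113 = 0.0081  # forward step
p_back_from_115 = branching_probability(k_115_to_117, k_115_to_113)

def overall_equilibrium_constant(step_constants):
    """Product of the per-step equilibrium constants along the track."""
    return prod(step_constants)

# Hypothetical track: four steps, each only modestly favorable (K = 2).
hypothetical_K = overall_equilibrium_constant([2.0, 2.0, 2.0, 2.0])

print(f"backstep fraction: {backstep_fraction:.1%}")
print(f"P(backstep from 115 at -150 mV): {p_back_from_115:.2f}")
print(f"overall K for four steps with K = 2: {hypothetical_K:.0f}")
```

Under the competing-rates assumption, a hopper at foothold 115 is slightly more likely to step back to 117 than forward to 113, which is in line with most of the recorded backsteps occurring at that position.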

Each subunit of the αHL pore offers two antiparallel β strands to the transmembrane β barrel with an interstrand distance of ~5 Å (Cα-Cα). Given that the formation of cross-strand disulfides has been reported (14), we reasoned that the addition of a cysteine on an adjacent strand would compel hopper release from the track at a designated foothold. Indeed, with an L-shaped track consisting of cysteines at positions 115, 117, 119, and 139, the hopper attached to the track at foothold 115 by regioselective disulfide formation and dissociated from the track when it reached foothold 119. The release was initiated by Cys139 through thiol-disulfide interchange to form a cross-strand disulfide bridge, which blocked the access of subsequent hoppers to foothold 119


Qing et al., Science 361, 908-912 (2018) 




(Fig. 2D). The preference for hopper release 
versus hopper transfer to the adjacent strand 
is attributed to an unfavorable collinear align- 
ment of the three participating sulfur atoms 
necessary for transfer. In the future, the engi- 
neering of footholds on a surface will allow the 
construction of more complex hopping pathways 
where hoppers are transferred to new tracks at 
designed junctions and cargos are released at 
predesignated depots. 

The ability to translocate a stretched DNA cargo while maintaining a covalent bond with the nanopore suggests a method for the chemical ratcheting of a nucleic acid during nanopore sequencing (15), which was explored in a proof-of-concept experiment. To provide a marker, we incorporated two adjacent abasic residues (1′,2′-dideoxyribose, dS) (16) into the cargo oligos carried by hoppers 2 and 3 and recorded current patterns during four-step hopping between Cys113 and Cys121 (Fig. 3A). By comparison with hopper 1, hoppers 2 and 3 showed different patterns of current modulation (Fig. 3, B and C, fig. S4, and table S7). The conductance patterns generated by the four-step hopping motion could be repeated with different molecules of hoppers 2 and 3 (n = 3 for each hopper), establishing the patterns as clear identifiers of each cargo sequence. The residual currents (Ires%, the remaining current as a percentage of the open pore current) for the three hoppers residing at each foothold were plotted for comparison (Fig. 3D). Hoppers 1 and 2 gave almost identical current blockades at each of footholds 115 and 113 under −150 mV, implying that the dSdS sequence had been transported well out of the sensing region by hopper 2. Moreover, hoppers 2 and 3 have a single nucleotide offset in the dSdS positions, and we observed a one-step offset between hoppers 2 and 3 in ΔIres%, the difference in Ires% between two successive steps [Fig. 3D; the vertical step size, 5.6 Å, is similar to the internucleotide distance in stretched single-stranded DNA, 6.9 Å (17)].
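
The two quantities defined in this paragraph, Ires% and ΔIres%, are simple to compute from a current trace. The Python sketch below shows the definitions as stated in the text. The current values are hypothetical placeholders chosen for easy arithmetic, not measurements from the paper, and the function names are invented for the example.

```python
def residual_current_percent(level_pA, open_pore_pA):
    """Ires%: remaining current as a percentage of the open pore current."""
    return 100.0 * level_pA / open_pore_pA

def step_differences(values):
    """Delta Ires%: difference in Ires% between two successive steps."""
    return [later - earlier for earlier, later in zip(values, values[1:])]

# Hypothetical currents (pA) for a hopper resting on five successive footholds.
open_pore_pA = 100.0
foothold_levels_pA = [20.0, 22.0, 25.0, 26.0, 30.0]

ires = [residual_current_percent(level, open_pore_pA) for level in foothold_levels_pA]
delta_ires = step_differences(ires)

print("Ires% per foothold:", ires)
print("Delta Ires% per step:", delta_ires)
```

Five footholds give four step sizes, so a marker that shifts by one nucleotide between two cargos would be expected to shift the pattern of step sizes by one position.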

These observations demonstrate that the hop- 
per system reported here has the potential to 
discriminate bases for sequencing purposes (16). 
An advantage of a processive hopper, which might 
improve sequencing accuracy, is the ability to 
reverse the chemical ratcheting process and 
thereby obtain many-fold coverage of an individ- 
ual DNA strand. Of course, the present system 
is limited by its short track. Although longer 
β-barrel pores exist (18, 19), a viable sequencing 
process will require protracted ratcheting of 


numerous DNA strands in parallel, perhaps by 
using footholds on an extended crystalline sur- 
face or internal thiophosphate feet to transport 
long replica strands over relatively short tracks. 
Observation of alkaline earth complexes M(CO)₈ (M = Ca, Sr, or Ba) that mimic transition metals

Xuan Wu, Lili Zhao, Jiaye Jin, Sudip Pan, Wei Li, Xiaoyang Jin, Guanjun Wang, Mingfei Zhou, Gernot Frenking


The alkaline earth metals calcium (Ca), strontium (Sr), and barium (Ba) typically engage in chemical bonding as classical main-group elements through their ns and np valence orbitals, where n is the principal quantum number. Here we report the isolation and spectroscopic characterization of eight-coordinate carbonyl complexes M(CO)₈ (where M = Ca, Sr, or Ba) in a low-temperature neon matrix. Analysis of the electronic structure of these cubic Oh-symmetric complexes reveals that the metal-carbon monoxide (CO) bonds arise mainly from [M(dπ)] → (CO)₈ π backdonation, which explains the strong observed red shift of the C-O stretching frequencies. The corresponding radical cation complexes were also prepared in gas phase and characterized by mass-selected infrared photodissociation spectroscopy, confirming adherence to the 18-electron rule more conventionally associated with transition metal chemistry.


The periodic table of the elements is conventionally divided according to the valence atomic orbitals (AOs) into main-group s and p blocks, a transition metal d block, and a lanthanide and actinide f block. A useful set of guidelines for understanding the structures and stabilities of molecules encompasses the associated 8-, 18-, and 32-electron rules introduced by Langmuir (1, 2) before the advent of quantum theory. These rules were later explained by attributing particular stability to filled sp, spd, or spdf valence shells, respectively (3).

The alkaline earth elements beryllium, magnesium, calcium, strontium, and barium have an ns² valence-shell configuration, where n is the principal quantum number, and, as such, typically engage in chemical bonding as ionic salt compounds or in polar bonds via their two ns valence electrons in divalent M(II) species (4), where M is an alkaline earth metal. Earlier studies suggested that the heaviest atom barium may use its 5d AOs to some extent in chemical bonds (5), which led to the suggestion that barium be designated an “honorary transition metal” (6). Previously, we reported the experimental observation of barium carbonyl ions Ba(CO)^q (where charge q = +1 and −1) (7). The analysis of the electronic structure showed that the cation binds the ligand mainly through Ba⁺(5dπ¹) → CO(π* LUMO) backdonation (LUMO, lowest unoccupied molecular orbital), with Ba⁺ in the excited ²D (5d¹) electronic reference state. In that respect, the Ba(CO)⁺ complex behaves similarly to a transition metal carbonyl. The Ba-CO interactions in the radical anion Ba(CO)⁻ were consistent with dominant contributions of Ba(5dπ) ← CO(π* SOMO) π donation (SOMO, singly occupied molecular orbital) and Ba(5dσ/6s) ← CO(σ HOMO) σ donation (HOMO, highest occupied molecular orbital). The most important valence functions of barium in the Ba(CO)⁺ cation and the Ba(CO)⁻ anion thus appeared to be the 5d orbitals (7).

These findings inspired us to search for the 18-electron octacarbonyl complex Ba(CO)₈. Surprisingly, we found that not only barium but also the lighter homologs strontium and calcium formed octacarbonyl complexes M(CO)₈ (M = Ca, Sr, or Ba) that can be stabilized in a low-temperature neon matrix.

The neutral alkaline earth-carbonyl complexes were prepared by the reactions of pulsed laser-evaporated metal atoms and carbon monoxide (CO) in solid neon and were investigated using Fourier transform infrared absorption spectroscopy. The experiments were carried out with a wide range of CO concentrations (from 0.02 to 2% relative to Ne on the basis of volume). In the experiments with relatively low CO concentrations, terminally bonded mononuclear low-coordinate carbonyl complexes with C=O stretching frequencies in the 2050 to 1800 cm⁻¹ region were observed. Experiments with isotopically substituted CO samples allowed the unambiguous identification of some low-coordinate complexes through isotopic shifts and splittings. The barium di-, tri-, and tetracarbonyls can clearly be identified on the basis of spectra in the experiments with 0.03% ¹²C¹⁶O (fig. S1); 0.05% ¹²CO and 0.05% ¹³CO (fig. S2); and 0.05% C¹⁶O and 0.05% C¹⁸O (fig. S3). The monocarbonyls were theoretically predicted to be unstable (8) and were not observed in the experimental vibrational spectra. Intense absorption bands centered at 1987 cm⁻¹ for Ca, 1995 cm⁻¹ for Sr, and 2014 cm⁻¹ for Ba were observed upon progressive annealing of the samples to temperatures of 10 to 13 K under relatively high CO concentrations (Table 1). These absorptions become the dominant features in the spectra with high CO concentrations (see Fig. 1A for Ca and figs. S4 and S5 for Sr and Ba), suggesting that the absorber is the coordinatively saturated 18-electron octacarbonyl complex. The observation of only one carbonyl stretching band suggests that these neutral octacarbonyls have the highest cubic Oh symmetry. Experiments with mixtures of ¹²C¹⁶O and ¹³C¹⁶O and also ¹²C¹⁶O and ¹²C¹⁸O provided conclusive identification of these cubic octacarbonyl complexes. Although the bands of the Sr and Ba complexes are too broad to resolve isotopic splittings, the band of the Ca complex is sharp and intense in the spectra with relatively low CO concentrations (Fig. 1A); well-resolved mixed isotopic spectra could therefore be compared with calculations. The experimentally observed spectra are in good agreement with the simulated isotopic spectral features shown in figs. S6 and S7.

The radical cations of the alkaline earth-carbonyl complexes were prepared in the gas phase by using a pulsed laser vaporization-supersonic-expansion ion source and studied by mass-selected infrared photodissociation spectroscopy in the carbonyl stretching-frequency region. Typical mass spectra are shown in figs. S8 to S10. Mononuclear metal-carbonyl cation complexes [M(CO)ₙ]⁺ (M = Ca, Sr, or Ba) with n as high as 10 to 15 were observed. These larger complexes contain both strongly bound CO ligands directly coordinated to the central metal ion and weakly bound, or tagged, CO ligands (9, 10). All of these complexes dissociated by elimination of a CO ligand after photoexcitation at the C=O stretching vibrational frequencies. The infrared spectra of [Sr(CO)ₙ]⁺ (n = 6 to 9) are shown in Fig. 1B and the corresponding spectra of the Ca and Ba complexes in figs. S11 and S12. All the spectra for the [M(CO)ₙ]⁺ complexes with n = 6 to 8 feature a broad single band [full width at half maximum (FWHM) of 49 cm⁻¹ for Ca, 52 cm⁻¹ for Sr, and 29 cm⁻¹ for Ba for the n = 8 complexes] that is slightly red shifted relative to the free-CO absorption at 2143 cm⁻¹. The dissociation efficiency increases substantially (about 70% for Ca, 220% for Sr, and 107% for Ba) from n = 8 to 9, and the n = 9 complexes have a much narrower bandwidth (FWHM of 36 cm⁻¹ for Ca, 10 cm⁻¹ for Sr, and 16 cm⁻¹ for Ba). Along with the intense band around 2100 cm⁻¹, an additional weak band in the 2160 to 2180 cm⁻¹ region is also observed for the n = 9 complexes. The bands in this latter frequency region are assigned to the vibrations of weakly tagged external CO ligands (9, 10). The appearance of the tagged CO band at n = 9 indicates that the n = 8 complexes are coordinatively saturated.

We carried out quantum chemical calculations using density functional theory and ab initio methods to support the assignments of the vibrational spectra to the observed species and to examine the electronic structure of the carbonyl complexes. Figure 2A shows the optimized geometries of the neutral octacarbonyls calculated at the M06-2X-D3/def2-TZVPP level. The molecules have cubic (Oh) symmetry and a triplet (³A1g) electronic ground state with the valence electron configuration a1g² t1u⁶ t2g⁶ a2u² eg². Calculations of the singlet state (fig. S13) gave structures with D4d (Ca and Sr) or D4h symmetry (Ba), which were between 6.5 and 7.5 kcal/mol higher in energy than the corresponding triplet state species. Figure 2, B and C, shows the equilibrium geometries of the cations [M(CO)₈]⁺, which resemble the neutral species in the singlet state. Thus, [Ca(CO)₈]⁺ and [Sr(CO)₈]⁺ possess a D4d structure and a ²A1 electronic ground state, whereas [Ba(CO)₈]⁺ has D4h symmetry and a ²B2g electronic state. Figure 2 also shows the calculated zero-point energy (ZPE)-corrected bond dissociation energies (D0) of the octacarbonyls for loss of one and eight CO ligand(s). At the M06-2X-D3/def2-TZVPP level, the D0 values for the dissociation of one CO lie between 9.1 kcal/mol for Ca and 11.5 kcal/mol for Sr for the M(CO)₈ neutral complexes and between 8.4 kcal/mol for Ba and 9.6 kcal/mol for Sr for the [M(CO)₈]⁺ cation complexes. The theoretical D0 values for loss of eight CO ligands yielding M/M⁺ in the electronic ground state were between 58.8 kcal/mol for Sr and 63.3 kcal/mol for Ca for the neutrals and between 73.5 kcal/mol for Ba and 87.6 kcal/mol for Ca for the cations. The calculated values at the CCSD(T)/def2-TZVPP level using the M06-2X-D3/def2-TZVPP optimized geometries were slightly smaller. The basis set superposition error is quite small (0.3 to 0.4 kcal/mol for single CO dissociation and 0.6 to 1.0 kcal/mol for loss of eight COs).

[Fig. 2 panel labels, giving D0 for loss of one CO and of eight CO with CCSD(T) values in parentheses. M(CO)₈: 11.5 (11.1) and 58.8 (41.5) kcal/mol, M = Sr; 10.4 (8.7) and 62.8 (49.2) kcal/mol, M = Ba. [M(CO)₈]⁺: 9.5 (9.2) and 87.6 (76.0) kcal/mol, M = Ca; 9.6 (9.7) and 77.8 (68.4) kcal/mol, M = Sr. [Ba(CO)₈]⁺: 8.4 (8.5) and 73.5 (66.2) kcal/mol.]

The theoretical wave numbers for the C=O stretching modes of M(CO)₈ and [M(CO)₈]⁺ are shown in Table 1 along with the experimental values. The calculated values refer to the harmonic antisymmetric stretching frequencies scaled by a factor of 0.941, which comes from the ratio of the experimental stretching frequency of free CO (2143 cm⁻¹) to the calculated value (2278 cm⁻¹) (11). The calculated harmonic wave numbers are slightly higher than the experimental anharmonic values, but the trends for the different metals and for the isotopes of the neutral systems are in excellent agreement with the recorded values. The calculations suggest only one infrared (IR) active mode for the neutral complexes M(CO)₈ and two closely lying IR active modes for the cations [M(CO)₈]⁺. The latter splitting is too small to be experimentally observed. Theory and experiment indicate a considerable red shift of the C=O frequency for the neutral systems and a much smaller red shift for the cations. The calculated red shifts and the isotopic variations of the calculated and experimental wave numbers match each other.
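
The scaling factor quoted above can be reproduced directly from the two free-CO wave numbers given in the text: 0.941 corresponds to the experimental value divided by the calculated harmonic value. The Python sketch below shows that arithmetic and how such a factor would be applied. The example harmonic wave number is hypothetical, and the names are invented for illustration.

```python
# Free-CO stretching wave numbers quoted in the text (cm^-1).
EXPERIMENTAL_FREE_CO = 2143.0
CALCULATED_FREE_CO = 2278.0

# Scale factor that maps calculated harmonic values onto the experimental scale.
scale_factor = EXPERIMENTAL_FREE_CO / CALCULATED_FREE_CO

def scale_harmonic(wavenumber_cm1):
    """Apply the free-CO scale factor to a calculated harmonic wave number."""
    return wavenumber_cm1 * scale_factor

# Hypothetical calculated harmonic value for a carbonyl complex (cm^-1).
hypothetical_harmonic = 2120.0

print(f"scale factor: {scale_factor:.3f}")
print(f"scaled value: {scale_harmonic(hypothetical_harmonic):.0f} cm^-1")
```

By construction, scaling the calculated free-CO value returns the experimental 2143 cm⁻¹, so the scaled results for the complexes are referenced to free CO.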

Figure 3 shows the splitting of the valence orbitals of atoms with an spd valence shell in an octacoordinate cubic field with Oh symmetry (12). The breakdown of the AOs into irreducible representations of the Oh point group in an eight-coordinate ligand field is similar to the splitting in a six-coordinate octahedral field (13, 14), but there are two important differences. One concerns the splitting of the (n − 1)d AOs of the metal. The degenerate eg AOs in the octacoordinate cubic field are the (n − 1)dπ AOs, and the triply degenerate t2g AOs are the (n − 1)dσ AOs. By contrast, in the six-coordinate octahedral field, the degenerate eg AOs are the (n − 1)dσ AOs, and the triply degenerate t2g AOs have (n − 1)dπ character. The second difference concerns the appearance of the a2u molecular orbital (MO) in the cubic field (Fig. 3), which is absent in the octahedral field. The a2u MO is a ligand-only orbital, because there is no valence AO of the spd shell that possesses this symmetry. This has consequences for the number of electrons that are required to fulfill the 18-electron rule. Because two valence electrons of the ligands in a cubic field are not available for donation to the central metal atom M in ML₈, where L is a ligand, the complex must provide a total of 20 electrons to fully occupy the metal’s valence shell. This scenario has recently been explored in the transition


Fig. 1. IR spectra of alkaline earth carbonyl complexes. (A) IR absorption spectra of calcium-carbonyl complexes in the 2150 to 1850 cm⁻¹ region from codeposition of laser-evaporated calcium atoms with 0.1% CO in neon. Spectral lines: (i) after 30 min of sample deposition at 4 K, (ii) after a 12-K annealing period, (iii) after a 13-K annealing period, (iv) after 15 min of visible light irradiation, and (v) after another 12-K annealing period. (B) IR photodissociation spectra of the Sr(CO)ₙ⁺ (n = 6 to 9) complexes in the 2300 to 1800 cm⁻¹ region.


Fig. 2. Calculated equilibrium geometries of alkaline earth octacarbonyls. (A) M(CO)₈ (M = Ca, Sr, or Ba), (B) [M(CO)₈]⁺ (M = Ca or Sr), and (C) [Ba(CO)₈]⁺. Bond lengths are in angstroms. The D0 values in roman type are the ZPE-corrected bond dissociation energies for loss of one CO ligand; the italicized values are the corresponding energies for the loss of eight CO ligands and M/M⁺ in the ground state. The values without parentheses are from M06-2X-D3/def2-TZVPP calculations; the values in parentheses are from CCSD(T)/def2-TZVPP using M06-2X-D3/def2-TZVPP optimized geometries.
metal octacarbonyl anions [TM(CO)₈]⁻ (TM = Sc, Y, or La) (15). Only 18 electrons are available in M(CO)₈ (M = Ca, Sr, or Ba), so the degenerate eg MO is occupied by two electrons with the same spin, giving a triplet (³A1g) electronic ground state. Because the eg correlates with the (n − 1)dπ AO, it becomes clear that the electronic reference state of the alkaline earth atom in M(CO)₈ is a triplet state with ns⁰(n − 1)d² electron configuration. The metal center has a zero formal oxidation state.
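
The electron counting in this argument can be followed with a small bookkeeping sketch in Python. It fills the five valence levels of the cubic field in the order in which the text lists the configuration (a1g, t1u, t2g, a2u, eg); treating that listing as the filling order, and applying Hund's rule to the partly filled eg level, are assumptions made for illustration. The names are invented for the example and nothing here is a quantum chemical calculation.

```python
# (label, spatial degeneracy, has a matching metal spd valence AO)
CUBIC_LEVELS = [
    ("a1g", 1, True),
    ("t1u", 3, True),
    ("t2g", 3, True),
    ("a2u", 1, False),  # ligand-only orbital, no metal AO of this symmetry
    ("eg", 2, True),
]

def fill_levels(n_electrons):
    """Place electrons into the levels in the listed order, two per orbital."""
    occupation = {}
    remaining = n_electrons
    for label, degeneracy, _ in CUBIC_LEVELS:
        placed = min(2 * degeneracy, remaining)
        occupation[label] = placed
        remaining -= placed
    if remaining:
        raise ValueError("more electrons than the listed levels can hold")
    return occupation

def unpaired_electrons(occupation):
    """Unpaired electrons, assuming Hund's rule within each degenerate level."""
    total = 0
    for label, degeneracy, _ in CUBIC_LEVELS:
        n = occupation[label]
        total += n if n <= degeneracy else 2 * degeneracy - n
    return total

def metal_shell_electrons(occupation):
    """Electrons in levels that involve a metal s, p, or d valence AO."""
    return sum(occupation[label] for label, _, on_metal in CUBIC_LEVELS if on_metal)

valence_electrons = 2 + 8 * 2   # metal ns2 plus one sigma pair from each of 8 CO
occ18 = fill_levels(valence_electrons)
occ20 = fill_levels(20)

print("18 electrons:", occ18, "multiplicity", unpaired_electrons(occ18) + 1)
print("metal-shell count with 18 electrons:", metal_shell_electrons(occ18))
print("metal-shell count with 20 electrons:", metal_shell_electrons(occ20))
```

With 18 electrons the eg level holds two electrons with parallel spin, which gives the triplet described in the text, and 20 electrons are needed before the metal-centered levels hold 18.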

We analyzed the nature of the metal-CO bonds with the energy decomposition analysis-natural orbitals for chemical valence (EDA-NOCV) (16) method, a powerful tool that provides detailed insight into chemical bonding (17). A description is given in the methods section. Table 2 shows the calculated results for the interactions between the metal atom M in the triplet electronic reference state with an ns⁰(n − 1)d² electron configuration and the (CO)₈ cage at the frozen geometry of M(CO)₈. The interaction energies ΔEint suggest that the intrinsic attraction M-(CO)₈ is strong and varies in the order Ca > Sr >> Ba. The dominant contribution to the total interaction ΔEint comes from the orbital term ΔEorb. The preparation energies (ΔEprep) involved in the formation of the (CO)₈ cage from free CO molecules are low, ranging from 3.3 kcal/mol for Ba to 13.9 kcal/mol for Ca, whereas the electronic excitation energies for the atoms into the spherically symmetric ns² → ns⁰(n − 1)d² triplet state of M are quite high, lying between 68.2 kcal/mol for Ba and 159.5 kcal/mol for Ca. The strongly stabilizing interaction energies ΔEint caused by the attraction between the CO ligands and electronically excited M overcompensate the ΔEprep values. The adiabatic interaction energy ΔE (= −De,




Fig. 3. Bonding scheme and shape of the occupied valence orbitals of M(CO)₈ (M = Ca, Sr, or Ba). Splitting of the spd valence orbitals of an atom M with the configuration (n − 1)d²ns⁰np⁰ in the octacoordinate cubic (Oh) field of eight CO ligands is also given. Only the occupied valence orbitals that are relevant for the M-CO interactions are shown. Up and down arrows indicate electrons with opposite spin.


where D_e is the dissociation energy without
zero-point vibrational energy correction) with
respect to the electronic ground state of M and
eight CO is between −65.5 kcal/mol for Sr and
−73.7 kcal/mol for Ba. The D_e values in Table 2
exhibit a trend similar to that of the D_0 data in Fig. 2.
The former values do not include zero-point energy (ZPE)
contributions, and they are obtained from calculations
with different basis sets using Slater-type orbital
basis functions.
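
The energy bookkeeping in this paragraph can be checked with a few lines of arithmetic. The sketch below assumes the usual EDA relation, in which the adiabatic energy ΔE is the sum of the interaction energy ΔE_int, the cage preparation energy, and the atomic excitation energy. The variable names are illustrative, and the ΔE_int entries are taken to be the values listed under the interacting fragments in Table 2. The Ca total is derived here and is not quoted in the text.

```python
# Energy terms in kcal/mol, as quoted in the text and Table 2.
# Assumption: Delta E = Delta E_int + cage preparation + atomic excitation.
terms = {
    "Ca": {"e_int": -243.1, "e_prep_cage": 13.9, "e_excitation": 159.5},
    "Ba": {"e_int": -145.2, "e_prep_cage": 3.3, "e_excitation": 68.2},
}


def adiabatic_energy(t):
    """Sum the interaction, cage preparation, and excitation energies."""
    return t["e_int"] + t["e_prep_cage"] + t["e_excitation"]


adiabatic = {metal: adiabatic_energy(t) for metal, t in terms.items()}
for metal, value in adiabatic.items():
    print(f"{metal}: Delta E = {value:.1f} kcal/mol")
```

Under this assumption the Ba total reproduces the −73.7 kcal/mol quoted above, and the derived Ca total falls between the quoted Sr and Ba values.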

The most important insight from the EDA-NOCV
calculations is the breakdown of ΔE_orb
into pairwise orbital interactions. Table 2 shows
that the metal-CO bonding comes mainly from
the [M(dπ)] → (CO)₈ π backdonation of the degenerate
(e_g) set of singly occupied (n − 1)d AOs
of the metal into the antibonding π* MOs of
CO. This explains the large red shift of the C-O
stretching mode of the octacarbonyls. The contribution
of the [M(x)] ← (CO)₈ σ donation into
x, where x denotes the valence acceptor AO of
M, is much smaller than the [M(dπ)] → (CO)₈
π backdonation. The order of the acceptor AOs
of the metal atoms for the interaction energy is
(n − 1)dσ > ns > np. There is also a small stabilizing
contribution of the a_2u MO, which is due
to the polarization of the (CO)₈ ligand orbitals.

The dominating orbital interactions by the
valence d electrons of M can be explained with
the energetically very high-lying occupied orbitals,
which make M atoms excellent donor species.
In our previous study of Ba(CO)⁺, we found that
the cation Ba⁺ (5d¹) is a donor to neutral CO; the
atomic partial charge of Ba in Ba(CO)⁺ was calculated
as +1.39 e. The experimental values for
the energetically lowest-lying ³F state with the
electron configuration ns⁰(n − 1)d² are not far
from the ionization limit (18). The valence d electrons
of the M atoms are only weakly bound to
the atoms.

Figure 3 also shows the occupied valence MOs
of Ca(CO)₈ that are relevant for the metal-CO
bonding. The shape of the five (19) MOs reveals
that the contributions of the metal AOs are very
small, except in the SOMO, which clearly exhibits
the shape of the (n − 1)dπ AOs. The valence MOs of


Table 1. Calculated and experimental wave numbers. Calculated (M06-2X-D3/def2-TZVPP) and experimental wave numbers (cm⁻¹) of the C-O stretching mode
and frequency shifts of M(CO)₈ and [M(CO)₈]⁺ (M = Ca, Sr, or Ba). The calculated values are scaled by 0.941. Blank spaces indicate that isotope values are not available.

[Table body not recoverable from the source. Column groups: Complex; Calculated wave numbers; Experimental wave numbers.]

*Frequency shift relative to free CO.
†Isotopic frequency shift.
Table 2. EDA-NOCV results. EDA-NOCV results for triplet state M(CO)₈ (M = Ca, Sr, or Ba) complexes at the M06-2X/TZ2P-ZORA level using M06-2X-D3/
def2-TZVPP optimized geometries, taking (CO)₈ in singlet ground state and M in triplet excited state with a ns⁰(n − 1)d² valence electronic configuration as
interacting fragments. Energy values are given in kcal/mol.

Energy term    Assignment    Ca + (CO)₈    Sr + (CO)₈    Ba + (CO)₈
ΔE_int                       −243.1        −224.1        −145.2

[Remaining rows are not recoverable from the source. Surviving row assignments: [M(dπ)] → (CO)₈ π backdonation; M(CO)₈ → M (¹S) + 8 CO.]

*Contribution of the metahybrid term in M06-2X.
†The values in parentheses show the contribution to the total attractive interactions ΔE_elstat plus ΔE_orb, where ΔE_elstat is the electrostatic interaction energy.
‡The values in parentheses show the contribution to the total orbital interaction, ΔE_orb.
§The sum of the two (e_g) or three (t_2g or t_1u) components is given.
||The EDA calculations give a triplet state with spherically symmetrical distribution of the d electrons. The experimental
values for excitation into the energetically lowest-lying ³F state with ns⁰(n − 1)d² configuration are 124.2 and 59.8 kcal/mol for Ca and Ba, respectively. There is no
experimental value for the relevant ³F state of Sr. The data are taken from (18).


the heavier homologs Sr(CO)₈ and Ba(CO)₈ look
very similar (see figs. S14 and S15). The effect of
the orbital interactions on the charge distribution
is evident from the shape of the deformation
densities Δρ, which are associated with the orbital
interactions. Figure S16 shows the contour
line plots of the deformation densities Δρ(1) to
Δρ(5), which are connected to the pairwise interactions
ΔE_orb(1) to ΔE_orb(5) in Ca(CO)₈ (Table 2).
Note that only one component of the orbital
terms is shown, and the color code of the charge
flow is from red to blue. Figure S16A displays a
large charge flow in the direction [Ca(dπ)] →
(CO)₈, which comes from the π backdonation.
Figure S16, B to D, exhibits the charge flow in
the opposite direction [Ca(x)] ← (CO)₈ into the
valence AOs of calcium (n − 1)dσ, ns, and np.
Figure S16E shows the charge polarization of
the (CO)₈ a_2u ligand orbital, where electronic
charge is shifted from oxygen toward carbon.
The deformation densities Δρ(1) to Δρ(5) of the
orbital interactions in Sr(CO)₈ and Ba(CO)₈ are
shown in figs. S17 and S18, respectively.

The analysis of the electronic structure of the
octacarbonyls M(CO)₈ provides a straightforward
explanation for why the molecules have a cubic
(O_h) equilibrium geometry and a triplet (³A_2g)
electronic ground state with a triplet reference
state and ns⁰(n − 1)d² electron configuration.
The coordination by eight CO ligands in
a cubic field fills the spd valence AOs of the
alkaline earth atoms and the a_2u ligand-only
MO. The energetically highest-lying bonding
MO is the degenerate e_g orbital. Because 16 electrons
from the ligands and 2 electrons from the
metal are available, the e_g MO has SOMO components
following Hund’s rule. Simple electron
counting indicates that the octacarbonyls M(CO)₈
fulfill the 18-electron rule. The EDA-NOCV calculations
of the cations [M(CO)₈]⁺ agree with
the bonding analysis of the neutral complexes.
The [M(dπ)]⁺ → (CO)₈ π backdonation is weaker
than in the neutral species, because there is
only one (n − 1)dπ electron available (tables
S1 and S2).

The bonding situation in the alkaline earth-octacarbonyl
complexes shows that not only
barium but also strontium and calcium may effectively
use their (n − 1)d AOs in chemical bonding.
It is conceivable that the chemical reactivity
of heavier alkaline earth elements is more diverse
than hitherto thought. Recent reports about
unusual structures and reactivities of calcium and
strontium compounds (20, 21) could be a hint that
d-orbital participation of the alkaline earth metals
plays an important role. The design of future
experiments should consider the capacity of the
heavier alkaline earth elements to behave like
transition metals.

CLIMATE CHANGE


Increase in crop losses to insect 
pests in a warming climate 


Curtis A. Deutsch, Joshua J. Tewksbury, Michelle Tigchelaar,
David S. Battisti, Scott C. Merrill, Raymond B. Huey, Rosamond L. Naylor


Insect pests substantially reduce yields of three staple grains—rice, maize, and wheat—but 
models assessing the agricultural impacts of global warming rarely consider crop losses 
to insects. We use established relationships between temperature and the population 
growth and metabolic rates of insects to estimate how and where climate warming will 
augment losses of rice, maize, and wheat to insects. Global yield losses of these grains are 
projected to increase by 10 to 25% per degree of global mean surface warming. Crop 
losses will be most acute in areas where warming increases both population growth
and metabolic rates of insects. These conditions are centered primarily in temperate
regions, where most grain is produced.


By 2050, growing-season temperatures will
likely exceed those recorded during the
past century and may substantially reduce
crop yields (1-4). However, models assessing
the effects of climate warming on crop
yields rarely consider impacts on insect pests, 
despite the damages that result directly from 
pest infestations and indirectly from pesticides 
applied to reduce pest damage (5, 6). In the fu- 
ture, pest species are likely to differ in their re- 
sponses to warming, changing the relative impacts 
of pests geographically and among crops (7, 8). 
Here we use well-established relationships be- 
tween temperature and the physiology and de- 
mography of insects to project the future impact 
of insects on crop production globally and region- 
ally. We estimate pest-related changes in yields 
of the major grain crops maize, rice, and wheat, 
which together account for 42% of direct calories 
consumed by humans worldwide (9). 

A warmer climate will alter at least two agriculturally
relevant characteristics of insect
pests. First, an individual insect’s metabolic rate
(M) accelerates with temperature, and an insect’s
rate of food consumption must rise accordingly
(10-12). Second, the number of insects (n) will
change, because population growth rates of insects
also vary with temperature. These growth
rates are expected to decline as a result of warming
in tropical regions while rising elsewhere
(8) (fig. S1). The total energy consumption of a
pest population (the “population metabolism”)
is proportional to the product of these two factors
and directly relates to the crop yield loss
(L) caused by insect herbivory. Fractional changes
in pest-induced crop losses (ΔL/L) can thus be
partitioned into a metabolic component (ΔM/M)
and a demographic component (Δn/n) (13). The
sum of these fractional changes approximates
the total fractional change in yield loss


\[
\frac{\Delta L}{L} \approx \frac{\Delta M}{M} + \frac{\Delta n}{n} \tag{1}
\]
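
The approximation in Eq. 1 follows from the product rule: if yield loss scales with the product of per capita metabolic rate and population size, the exact fractional change also contains a cross term, which is small when both changes are modest. The minimal sketch below illustrates this; the function name and the input values are hypothetical and are not results from the study.

```python
def fractional_loss_change(dm_over_m, dn_over_n):
    """Compare the first-order sum (Eq. 1) with the exact product-rule value.

    Assumes loss L is proportional to M * n, so that
    (L + dL) / L = (1 + dM/M) * (1 + dn/n).
    """
    first_order = dm_over_m + dn_over_n
    exact = (1.0 + dm_over_m) * (1.0 + dn_over_n) - 1.0
    return first_order, exact


# Hypothetical inputs: a 19% metabolic rise and a 10% population rise.
first_order, exact = fractional_loss_change(0.19, 0.10)
print(f"first-order: {first_order:.3f}, exact: {exact:.3f}")
```

The difference between the two values is the cross term (ΔM/M)(Δn/n), which is why the sum only approximates the total change.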

To evaluate how warming changes the population
metabolism of insect pests, we integrated
established physiological responses of
insects to temperature into a spatially explicit
demographic model (13). The metabolic and
population growth rates were derived from
laboratory experiments across a wide range of
temperatures and for diverse insect taxa including
pest species. Relationships between temperature
and insect population growth rates
drive logistic population increases of insects
during each crop’s growing season, and they also
scale the fractional survival rate of insects over
the rest of the year (14), termed the diapause survival,
φ_o. We calibrated key demographic model
parameters (population size and carrying
capacity) using contemporary crop yields (15)
and their insect-related losses, measured for
our three focal crops at sites around the world
(5). To predict future changes in population
growth and metabolic rates, we added projected
monthly surface temperature anomalies from
climate model simulations under a “business-as-usual”
emissions scenario (RCP8.5) (16) to the
observed daily and seasonally varying temperatures
from the 20th century (1950 to 2000).
Results are presented for several climate models
that span a range of climate sensitivities and
for a range of uncertainties in biological traits
and assumptions (13). We report yield losses as
a function of global mean surface temperature
change, making the results comparable across
emissions scenarios, time periods, and climate
sensitivities.
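
The two demographic ingredients described above, logistic growth during the growing season and fractional survival over the rest of the year, can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not the authors' spatially explicit model: the daily time step, the function name, and all parameter values are hypothetical.

```python
def simulate_pest_population(r, carrying_capacity, n_start, phi_o,
                             season_days, years):
    """Return the end-of-season population for each simulated year.

    r is a per-day growth rate (hypothetical); phi_o is the fraction of
    insects surviving from one growing season to the next.
    """
    n = n_start
    end_of_season = []
    for _ in range(years):
        for _ in range(season_days):
            # Daily logistic growth step during the growing season.
            n += r * n * (1.0 - n / carrying_capacity)
        end_of_season.append(n)
        # Only a fraction phi_o survives the rest of the year.
        n *= phi_o
    return end_of_season


# Hypothetical comparison of a slower and a faster growth rate.
cool = simulate_pest_population(0.05, 1000.0, 1.0, 0.001, 120, 3)
warm = simulate_pest_population(0.10, 1000.0, 1.0, 0.001, 120, 3)
print(cool, warm)
```

In this toy setting the slower-growing population ends the season well below carrying capacity, so it has more room to increase if its growth rate rises.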

Crop production losses to pests increase 
globally with rising temperatures in all climate 
models and across all biological parameters 
(Fig. 1). When average global surface temperatures 
increase by 2°C, the median increase in yield 
losses owing to pest pressure is 46, 19, and 31% 
for wheat, rice, and maize, respectively, bringing 
total estimated losses to 59, 92, and 62 metric 
megatons per year. These projected losses are 
similar across all climate models and are thus 
robust to uncertainties in both global and re- 
gional warming patterns, although the time at 
which such damage levels are reached depends 
on the emissions scenario and on each model’s 
sensitivity to increasing atmospheric CO2 (Fig. 1D).
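
The percentage increases and the projected totals quoted above imply a present-day loss for each crop. The sketch below backs these out, assuming each percentage applies to the current insect-related loss, so that total = current × (1 + increase). The derived baselines are illustrative arithmetic and are not values stated in the text.

```python
# (median % increase at 2 °C warming, projected total loss in Mt/yr), from the text.
projected = {"wheat": (46, 59), "rice": (19, 92), "maize": (31, 62)}


def implied_current_loss(percent_increase, projected_total):
    """Back out the current loss, assuming total = current * (1 + increase)."""
    return projected_total / (1.0 + percent_increase / 100.0)


baseline = {crop: implied_current_loss(pct, total)
            for crop, (pct, total) in projected.items()}
for crop, value in baseline.items():
    print(f"{crop}: about {value:.1f} Mt/yr lost to insects today (derived)")
```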

The differences in global grain losses between 
crops and across model parameters (Fig. 1) reflect 
the distinct spatial patterns of demographic and 
metabolic impacts of warming on insect pests in 
the climates where these crops are grown. In 
temperate regions, warming increases both the 
size of insect populations and their per capita 
metabolic rate (Fig. 2, right). As a result, the in- 
crease in pest-related crop loss is consistently 
larger than in tropical regions, where the in- 
creasing metabolic rate is offset by declining 
population growth rates, resulting in a smaller 
overall rise in crop damages. This broad geo- 
graphic pattern holds across all crops, climate 
models, and life history parameters considered 
(Fig. 2 and figs. S3 and S4). 

The contribution of per capita metabolic rates 
to the total pest-induced crop losses is projected 
to increase consistently across regions and over 
time. For each of the three crops examined here, 
increases in temperature vary only modestly 
across growing regions and seasons, causing 
a nearly uniform fractional rise in the metabolic 
rates of the insect pests (Fig. 1). The magnitude of 
the metabolic component (Eq. 1) is proportional
to the temperature sensitivity of metabolic rates,
E_met, which varies by <50% across insect species
(E_met = 0.65 ± 0.15 eV; mean ± standard deviation)
(12). As a result, the metabolic component of
insect pest population metabolism can be esti- 
mated relatively robustly at both regional and 
global scales. 
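
To give a sense of scale for the metabolic component, the sketch below assumes a Boltzmann-Arrhenius dependence of metabolic rate on temperature, M ∝ exp(−E_met / kT), with E_met in eV. The functional form, the 20°C baseline, and the function name are assumptions made for illustration; the text specifies only the mean and spread of E_met.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def metabolic_fractional_change(e_met_ev, t_base_c, warming_c):
    """Fractional rise in metabolic rate, assuming M ~ exp(-E_met / (k T))."""
    t0 = t_base_c + 273.15
    t1 = t0 + warming_c
    exponent = (e_met_ev / BOLTZMANN_EV_PER_K) * (1.0 / t0 - 1.0 / t1)
    return math.exp(exponent) - 1.0


# Mean E_met and one standard deviation either side, for 2 °C of warming
# from a hypothetical 20 °C baseline.
changes = {e: metabolic_fractional_change(e, 20.0, 2.0)
           for e in (0.50, 0.65, 0.80)}
for e_met, change in changes.items():
    print(f"E_met = {e_met:.2f} eV: metabolic rate rises by {100 * change:.0f}%")
```

Because the exponent is close to linear in E_met for small warming, the fractional rise is roughly proportional to the temperature sensitivity, consistent with the statement above.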

In contrast, the demographic component of 
future crop loss to insect pests is spatially vari- 
able and can either exacerbate or ameliorate the 
impact of rising metabolic rates (Fig. 1 and figs. 
S3 and S4). In the lowland tropics, pest populations
are predicted to decline because current
temperatures there are already near optimal, so 
warming should reduce population growth rates 
(8) (fig. S2). On the other hand, extratropical pest 
populations are generally projected to grow as 
temperatures become closer to optimal, with a 
small contribution from increasing diapause survival
as winters warm (14) (fig. S6). Because temperate
populations often reach carrying capacity
only late in the growing season, if at all, they 
have the most potential for increases in popula- 
tion size as temperature rises (fig. S2). How much 
they increase depends on baseline survival rates 




Fig. 1. Global loss of crop production owing to the impact of climate warming on insect pests.
Crop production losses for (A) wheat, (B) rice, and (C) maize are computed by multiplying the
fractional change in population metabolism by the estimated current yield loss owing to insect pests,
summed over worldwide crop locations. Results are plotted versus mean global surface temperature
change, for four climate models (13), for two different values of the demographic parameter
governing survival during diapause (φ_o = 0.0001, asterisks; φ_o = 0.001, circles), and for the metabolic
effect alone (triangles). Mt/yr, metric megatons per year. The year in which a given global mean
temperature anomaly is reached (D) depends on the greenhouse gas emissions scenario (RCP,
representative concentration pathway) and varies across models (shading) owing to uncertainty in
climate sensitivity to those emissions (13).


during the nongrowing season (φ_o), which can be
highly variable. However, the pattern of weak demographic
impacts in tropical regions and strong
deleterious impacts in northern temperate regions
is consistent across a wide range of plausible φ_o
values, from 0.0001 to 0.01 (fig. S3).

Because our three focal crops are grown in 
different climates, where warming can induce 
opposite changes in insect population growth 
rates, diapause survival differentially affects losses 
of these three crops. For wheat, which is typically 
grown in relatively cool climates, warming will
increase pest population growth and overwinter 
survival rates, leading to large population increases 
in the growing season (Fig. 1A). In rice, which is 
grown in relatively warm tropical environments, 
the same population dynamic has the opposite 
impact; warming there should reduce insect pop- 
ulation growth rates and thus partly counteract 
the rising crop losses due to increased insect 
metabolism, allowing global rice production lost 
to insects to stabilize for warming exceeding 
~3°C (Fig. 1B). For maize, the demographic effect
has only a small net impact on global production
losses, because this crop is grown in some regions
where population growth rates will increase and in other
regions where they will decline, in
nearly equal measure (Fig. 1C).

The spatial patterns of modeled changes 
in insect population metabolism also predict 
differential impacts across major geopolitical 
boundaries (Fig. 3). The most substantial yield 
declines will occur in many of the world’s most 
productive agricultural regions, thus reducing 
global grain availability (Fig. 3 and table S5). 
France, the United States, and China—countries 
that produce most of the world’s maize—are 
also among the countries projected to experience 
the largest increases in pest-related crop losses 
(Figs. 1C and 3C). These countries have among 
the highest yields per hectare today (Fig. 3). In 
addition, France and China are responsible for 
a considerable fraction of global wheat and rice 
production, respectively, and are projected to 
suffer large increases in yield loss of these grains 
owing to climate impacts on pests (Figs. 1C and 
3C and table S5). 

Our analysis focuses on the changing impacts 
of insect pests on crop yields with an increase in 
global temperature, accounting for the most robust 
general responses of insect pests to temperature. 
The full scope of physiological and ecological 
impacts is likely to be complex and sensitive to 
particular crop-pest interactions for which more 
physiological data will be needed, especially 
among tropical pest species (fig. S1). These interactions
will occur in conjunction with direct
plant responses to warming and rising CO2 levels,
which, for the three major crops that we considered,
are predominantly negative (17). However,
scenarios with added or alternative biological 
dynamics, such as thermoregulation by insects 
(18) or increased diapause mortality with warm- 
ing (19), suggest that the dominant patterns 
described here are robust (figs. S5 and S6), and 
species-specific predictions for pests that affect these 
three crops generally agree with our predictions (13). 

Agricultural practices will shift as the cli- 
mate warms. Changes in planting dates, cultivar 
use, and planting locations are already under 
way (20) and will become more pronounced 
as the rate of climate warming increases (21). 
Our results suggest that farmers will need to 
make additional changes, such as introducing 
new crop rotations, to maintain yields in the 
face of rising insect pest pressure. In inten- 
sive agricultural environments, adaptation mea- 
sures may involve greater pesticide use, at the 
cost of associated health and environmental 
damage and the elevated threat of pesticide 
resistance. Without wider attention to how cli- 
mate warming will affect crop breeding and 
sustainable pest management strategies, insect- 
driven yield losses will result in reduced global 
grain supplies and higher staple food prices. 
Poor grain consumers and farming households, 
who account for a large share of the world’s 800 
million people living in chronic hunger (9), will 
suffer most. 
Fig. 2. Projected geographic pattern of change in crop yield losses to insect pests
in a 2°C-warmer climate. Results are mapped for the fractional (percent) increase in crop
yield loss owing to pests from both metabolic and demographic effects (ΔL/L) for (A) wheat,
(B) rice, and (C) maize. The zonal median change is plotted for the separate
contribution of demographic effects (Δn/n, blue) and metabolic effects (ΔM/M, red)
for (D) wheat, (E) rice, and (F) maize. Results are shown for a range of life
history traits in the longitudinal average curves (right panels). The
metabolic effect uses activation energies (E_met) with a mean
(0.65 eV) and standard deviation (±0.15 eV) for insects (12).
The demographic effect assumes a range of φ_o values
from 0.0001 to 0.01. All results are averaged over
multiple climate models (13), in all years when the global
mean surface temperature is 2 ± 0.1°C greater than in the
late 20th century.


Fig. 3. Predicted regional increases in crop losses to insect pests in a 2°C-warmer
climate. The change in future yield loss for each country
is shown for the median grid cell within each country and
plotted as a function of its median present-day crop
yield per unit of planted area for (A) wheat, (B) rice, and
(C) maize. The symbol size is scaled to total current
production for each country, and color indicates the United
Nations region. For each crop, the five countries with
the highest current production are labeled and circled. The
geographic burden of additional future production losses
is shown in the pie charts. A full list of effects by region
and country can be found in tables S1 to S5.

Past and future global transformation
of terrestrial ecosystems under

Impacts of global climate change on terrestrial ecosystems are imperfectly
constrained by ecosystem models and direct observations. Pervasive ecosystem
transformations occurred in response to warming and associated climatic changes
during the last glacial-to-interglacial transition, which was comparable in
magnitude to warming projected for the next century under high-emission
scenarios. We reviewed 594 published paleoecological records to examine compositional
and structural changes in terrestrial vegetation since the last glacial period and
to project the magnitudes of ecosystem transformations under alternative future
emission scenarios. Our results indicate that terrestrial ecosystems are highly
sensitive to temperature change and suggest that, without major reductions in
greenhouse gas emissions to the atmosphere, terrestrial ecosystems worldwide are
at risk of major transformation, with accompanying disruption of ecosystem services
and impacts on biodiversity.


Terrestrial ecosystem function is governed
largely by the composition and physical
structure of vegetation (1-3), and climate
change impacts on vegetation can potentially
cause disruption of ecosystem services and
loss of biodiversity (4, 5). It is critical to assess 
the likely extent of ecosystem transformation 
as global greenhouse gas (GHG) emissions in- 
crease (6) and to understand the full potential 
magnitude of impacts should current GHG 
emission rates continue unabated. 

Ecosystem transformation generally involves 
the replacement of dominant plant species or 
functional types by others, whether recruited 
locally or migrating from afar. Observations from 
around the globe indicate that current climate 


Fig. 1. Vegetation differences between 
the LGP and the present. Each square 
represents an individual paleoecological 
site. The color density indicates the magnitude 
of estimated vegetation change since the 
LGP (21,000 to 14,000 yr B.P.). Background 
shading denotes the estimated temperature 
anomaly between the LGM 21,000 years 
ago and today on the basis of assimilated 
proxy-data and model estimates (27). 

(A) Composition. (B) Structure. 
change may already be driving substantial changes
in vegetation composition and structure (3). Eco- 
system change is accelerated by mass mortality 
of incumbent dominants (7, 8), and widespread 
dieback events and other large disturbances are 
already under way in many forests and wood- 
lands (9-11), with further mortality events pre- 
dicted under increasing temperatures and drought 
(3, 9, 10, 12). Replacement of predisturbance 
dominants by other species and growth forms 
has been widely documented (8, 13, 14). In addi- 
tion, evidence is accumulating for geographic 
range shifts in individual species, and climate 
change is interacting with invasive species, fire 
regimes, land use, and CO2 increase to drive
vegetation changes in many regions (15, 16). 




31 August 2018 




Beyond observations of recent and ongoing 
change, models indicate ecosystem transforma- 
tion under climate projections for the 21st 
century. These include dynamic global veg- 
etation models (3, 17), species distribution 
models (18), and comparison of the multivariate
climate distance between biomes with that be- 
tween modern and future climates (19). However, 
the capacity for assessing the magnitudes of 
ecosystem transformation under future climate 
scenarios is limited by the difficulty of evaluat- 
ing model performance against empirical records, 
particularly when projected climate states are 
novel (19, 20). 

Paleoecological records of past ecological re- 
sponses to climate change provide an independent 
means for gauging the sensitivity of ecosystems 
to climate change. High-precision time-series 
studies indicate that local and regional ecosystems 
can shift rapidly, within years to decades, under 
abrupt climate change (21-23), but sites with 
such detailed chronologies are scarce. In this 
study, we used published reports to compile 
a global network of radiocarbon-dated paleo- 
ecological records of terrestrial vegetation com- 
position and structure since the Last Glacial 
Maximum (LGM), ~21,000 years before the pres- 
ent (yr B.P.) (24). Most postglacial warming 
happened 16,000 to 10,000 yr B.P., although it 
commenced earlier in parts of the Southern 
Hemisphere (25, 26). Global warming between 
the LGM and the early Holocene (10,000 yr B.P.) 
was on the order of 4 to 7°C, with more warm- 
ing over land than oceans (26, 27). These esti- 
mates are roughly comparable to the magnitude 
of warming that Earth is projected to undergo 
in the next 100 to 150 years if GHG emissions 
are not reduced substantially (28). The magni- 
tudes of changes in vegetation composition and 
structure since the last glacial period (LGP) 
provide an index of the magnitude of ecosystem 
change that may be expected under warming 
of similar magnitude in the coming century 
(29). Although the rate of projected future glob- 
al warming is at least an order of magnitude 
greater than that of the last glacial-to-interglacial 
transition (26), a glacial-to-modern compari- 
son provides a conservative estimate of the ex- 
tent of ecological transformation to which the 
planet will be committed under future climate 
scenarios. 

We reviewed and evaluated paleoecological 
(pollen and macrofossil) records from 594 sites 


worldwide (fig. S1), all drawn from peer- 
reviewed published literature, to determine the 
magnitude of postglacial vegetation change. 
We adopted an expert-judgment approach in 
which paleoecologists with relevant regional ex- 
perience compiled published records (table S1); 
reviewed the data, diagrams, and accompany- 
ing papers; and inferred the composition and 
structure of the glacial-age and Holocene veg- 
etation at each site (24). For the purposes of 
our analyses, we defined the LGP as the interval 
between 21,000 and 14,000 yr B.P. Although 
postglacial warming was under way in many 
regions by 16,000 yr B.P. (25), continental ice 
sheets were still extensive 14,000 yr B.P., and 
some climate regimes remained essentially 
“glacial” in nature, particularly in the Northern 
Hemisphere (30). Extending the LGP window 
to 14,000 yr B.P. provides a larger array of rec- 
ords for the assessment, both in glaciated and 
unglaciated terrains, and renders our analysis 




more conservative (climatic and vegetation con- 
trasts with the Holocene are likely to decrease 
between 21,000 and 14,000 yr B.P.). 

For each record, experts were asked to clas- 
sify the magnitudes of compositional change and 
structural change since the LGP as large, mod- 
erate, or low and to provide detailed justifica- 
tion for their judgments (24) (table S2). This
placed all the diverse records into a common 
framework for comparison. For sites that experi- 
enced moderate to large ecological change, 
experts were also asked to assess the role of 
climate change (large, moderate, or none) in 
driving the observed vegetation change. For 
each of these four judgments, experts were asked 
to state their level of confidence as high, medium, 
or low. In assessing the role of climate change, 
experts were asked to focus specifically on wheth- 
er climate change since the LGP was sufficient 
to drive the observed changes, acknowledging 
that other factors (e.g., human activity, postglacial 


Fig. 2. Estimated temperature differences for different categories of vegetation
response. Box plots of the estimated mean annual temperature differences
between the LGM and today in each of the three vegetation change categories
(low, moderate, and large) for (A) composition and (B) structure.
Low vegetational changes are associated largely with relatively small temperature
anomalies, whereas moderate and large changes are associated
with larger post-LGM temperature differences, indicating that the
magnitude of temperature change plays an important role in the
magnitude of vegetation change. The glacial temperature anomalies
are from data in (27). Analyses using the TraCE-21ka simulation show
similar patterns (fig. S4).


CO2 increase, and megafaunal dynamics) may
have also played important roles. For sites with 
a long history of human land use, experts used 
Holocene records predating widespread land 
clearance as a benchmark for comparison with 
the LGP records. 

Our results indicate that the magnitude of 
past glacial-to-interglacial warming was suffi- 
cient at most locations across the globe to drive 
changes in vegetation composition that were 
moderate (27% of sites) to large (71%), as well as 
moderate (28%) to large (67%) structural changes 
(Fig. 1 and table S3). These changes were par- 
ticularly evident at mid- to high latitudes in the 
Northern Hemisphere, as well as in southern 
South America, tropical and temperate south- 
ern Africa, the Indo-Pacific region, Australia, 
Oceania, and New Zealand (Fig. 1A). Com- 
positional change at most sites in the Neo- 
tropics was moderate to large, but three sites 
showed little or no compositional change, all

Fig. 3. Estimated vegetation change under future climate scenarios. (A) Box plots
of the estimated mean annual temperature differences between today and future climate
simulations for individual sites (as determined by using the nearest grid point). Most
sites show relatively small temperature change under the low-emission scenario (RCP 2.6),
with substantially greater change under the high-emission scenario (RCP 8.5). (B) Probabilities 
of large changes in vegetation composition and structure as a function of temperature 
change. (C to F) Estimated probabilities of large compositional and structural changes by 
the end of the 21st century (the average of the period from 2081 to 2100) under RCP 2.6 
(C and E) and RCP 8.5 (D and F). Probabilities (B to F) are estimated from a logistic spline 
regression model fit by using LGM-to-modern temperature change as a predictor variable 
and observed LGP-to-modern vegetation changes (large versus not large) as the response 
variable. Future temperature increases are calculated as an average for 2081 to 2100 
under the model scenarios, minus an average for 1985 to 2005 from the CCSM4 historical 
simulation. Analyses using the TraCE-21ka simulation show similar patterns (fig. S7). 


with medium to high confidence (fig. S2). Shifts 
in vegetation structure were also moderate to 
large at mid- to high-latitude sites, although a 
few sites showed low change (Fig. 1B). The Neo- 
tropics had nine sites with little or no structural 
change (Fig. 1B), all with high-confidence assess- 
ments (fig. S2). These sites have been occupied 
by tropical forest ecosystems since the LGM, 
although most have undergone moderate to 
large compositional change (31, 32). For nearly
all sites that experienced moderate or large eco- 
logical change, climate change since the LGP 
was judged to be sufficient to explain the ob-
served changes with high confidence (table S4).
Atmospheric CO₂ concentrations also increased
from 190 to 280 parts per million during the 
deglaciation, interacting with and in some cases 
modulating ecological responses to climate 
change. However, CO₂ changes alone cannot
account for postglacial vegetation changes (sup- 
plementary text). 

Independently of the expert-judgment pro- 
cess, we used the estimated anomaly in mean 
annual temperatures between the LGM and 
the present (preindustrial) as a proxy for the 
overall magnitude of climate change since the
LGP (24). LGM temperature estimates were
derived using an assimilated data-model in- 
tegration (27). Low-change sites were largely 
concentrated in regions where the estimated 
temperature anomaly was relatively small (Fig. 1). 
To explore this relationship further, we plotted 
the frequency distribution of the difference 
between estimated LGM and present-day mean 
annual temperatures for individual sites in 
each of the three ecological-response categories. 
Nearly all sites with low compositional change 
between the LGP and today are associated with 
small estimated temperature anomalies (median, 
2.4°C), whereas sites with moderate to high 
compositional change have larger temperature 
anomalies (Fig. 2A). Results for structural changes 
are similar, although a greater number of sites 
with low structural change include larger tem- 
perature anomalies (Fig. 2B). This difference 
is not surprising, because compositional change 
in vegetation can occur without an accompany- 
ing change in vegetation structure (Fig. 1). Europe 
and eastern North America experienced un- 
usually large temperature changes since the 
LGM, owing to depressed temperatures near 
the large ice sheets, and these regions show 
substantial compositional and structural changes 
since the LGP. However, results from other parts 
of the globe indicate that widespread ecosystem 
changes were driven by much smaller temper- 
ature changes (fig. S3). We repeated our anal- 
ysis using the TraCE-21ka model simulations 
(33, 34), which yield a lower magnitude of LGP- 
to-Holocene climate change (35); despite the 
potential conservative bias, results for com- 
positional and structural changes (fig. S4) were 
similar to those in Fig. 2. Temperature dif- 
ferences between the LGP and the present were 
substantially greater for sites with large eco- 
logical change than for those with low to mod- 
erate change, by both paleoclimate estimates 
(27, 33) (table S2). 

We also used our database of ecological change 
since the LGM to assess the global distribution 
of the probabilities of large compositional and 
structural changes given GHG emission scenar- 
ios [representative concentration pathways (RCPs) 
2.6, 4.5, 6.0, and 8.5, each as simulated by the 
Community Climate System Model version 4 
(CCSM4)] (24, 36). The range of LGM-to-present 
temperature changes (Fig. 2) overlaps with the 
range of temperature changes projected for the 
coming century under these scenarios (Fig. 3A 
and fig. S5). We quantified the relationship be- 
tween temperature and ecological change by 
using a logistic spline regression with ordered 
categories (37). We fit models for compositional 
and structural change by using the temperature 
change since the LGM as the independent pre- 
dictor variable. In both models, LGM-to-modern 
temperature change is a significant predictor of 
ecosystem change (P < 0.001). We then used 
these models to predict the risk of large change 
for the future range of projected global temper- 
ature changes (Fig. 3B) and to map the prob- 
ability of large change under RCP 2.6 and RCP 
8.5 (Fig. 3, C to F) at the end of the 21st
century (see fig. S6 for RCP 4.5 and RCP 6.0).
Under RCP 2.6, the probability of large compo- 
sitional change is less than 45% over most of 
the globe (Fig. 3C) and the probability of large 
structural change is generally less than 30% 
(Fig. 3E). By contrast, under the business-as- 
usual emissions scenario, RCP 8.5, the proba- 
bilities of large compositional change and large 
structural change are both greater than 60% 
(Fig. 3, D and F). Analyses using the TraCE- 
21ka model yielded similar patterns (fig. S7). 
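
The step from observed past changes to a probability of future change can be illustrated with a much simpler model than the one used here. The sketch below fits a plain binary logistic regression (large versus not large change) on a single temperature predictor by Newton's method. It is a stand-in for, not a reproduction of, the logistic spline regression with ordered categories (37). The function names and the toy numbers are hypothetical and are not data from this study.

```python
import math

def prob_large(b0, b1, delta_t):
    """Modeled probability of a large vegetation change at warming delta_t (deg C)."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * delta_t)))

def fit_logistic(temps, large, iterations=25):
    """Fit intercept b0 and slope b1 by Newton's method (maximum likelihood)."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for t, y in zip(temps, large):
            p = prob_large(b0, b1, t)
            w = p * (1.0 - p)
            g0 += y - p          # gradient of the log-likelihood
            g1 += (y - p) * t
            h00 += w             # entries of the negative Hessian
            h01 += w * t
            h11 += w * t * t
        det = h00 * h11 - h01 * h01
        if det == 0.0:
            break
        # Newton update: solve the 2x2 system (negative Hessian) * step = gradient
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Hypothetical toy data (NOT from the study): warming in deg C and
# whether the site was scored as "large" change (1) or not (0).
toy_temps = [1, 2, 3, 4, 5, 6, 7, 8]
toy_large = [0, 0, 1, 0, 1, 0, 1, 1]
b0, b1 = fit_logistic(toy_temps, toy_large)
```

With a fitted pair `(b0, b1)`, `prob_large(b0, b1, delta_t)` plays the role of the curve in Fig. 3B for any projected warming `delta_t`; applying it to gridded temperature projections would give maps analogous to Fig. 3, C to F.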

Our study uses a single variable, mean annual 
temperature, as a metric for the broader array 
of climatic changes that can drive vegetation 
change, and it compares vegetation and climate 
states separated by 10,000 to 20,000 years. Fu- 
ture climate change, like that in the past, will 
be multivariate, involving shifts in seasonal tem- 
peratures, seasonal precipitation, climate extremes, 
and variability regimes. As mean annual tem- 
perature increases, other ecologically important 
variables will change, often in complex or counter- 
intuitive ways (20, 38, 39), and ecological responses 
will often be episodic or nonlinear (8, 13-15). 
Although the temperature increases since the 
LGP provide crude analogs for ongoing and 
future climate changes—for example, boundary 
conditions and forcings are different now 
(26, 40, 41)—our results nevertheless provide 
concrete evidence that vegetation composition 
and structure are sensitive to changes in mean 
annual temperature of the magnitudes forecast 
for the coming century and that vegetation 
transformations will become increasingly exten- 
sive as temperatures increase. Under the RCP 
8.5 scenario, the rate of warming will be on the 
order of 65 times as high as the average warm- 
ing during the last deglaciation (26). Further- 
more, the warming between the LGP and the 
Holocene occurred within the range of previous 
glacial and interglacial temperatures, whereas 
projected future changes will exceed those ex- 
perienced over the past 2 million years (26). 
Although many ecological responses (e.g., species 
migration, colonization, and succession) will 
likely lag behind climate changes, ecosystem 
transformations will often be accelerated by 
disturbance and mortality events, land use, and 
invasive species (7-15). 

We therefore conclude that terrestrial veg- 
etation over the entire planet is at substantial 
risk of major compositional and structural 
changes in the absence of markedly reduced 
GHG emissions. Much of this change could 
occur during the 21st century, especially where 
vegetation disturbance is accelerated or amp- 
lified by human impacts (7). Many emerging 
ecosystems will be novel in composition, struc- 
ture, and function (42), and many will be 
ephemeral under sustained climate change; 
equilibrium states may not be attained until 
the 22nd century or beyond. Compositional 
transformation will affect biodiversity via dis- 
integration and reorganization of communities, 
replacement of dominant or keystone species, 
pass-through effects on higher trophic lev-
els, and ripple effects on species interactions (16, 43).


Nolan et al., Science 361, 920-923 (2018) 


Structural transformation will have particularly 
large consequences for ecosystem services (4), 
including the achievement of nature-based 
development solutions under the United Na- 
tions’ Sustainable Development Goals (44). 
Structural changes will also influence bio- 
diversity, driving alterations in habitats and 
resources for species at higher trophic levels. 
Compositional and structural changes may 
also induce potentially large changes to car- 
bon sources and sinks, as well as to atmo- 
spheric moisture recycling and other climate 
feedbacks. Our results suggest that impacts 
on planetary-scale biodiversity, ecological func- 
tioning, and ecosystem services will increase 
substantially with increasing GHG emissions, 
particularly if warming exceeds that projected 
by the RCP 2.6 emission scenario (1.5°C).

GENOME STRUCTURE


Three-dimensional genome structures 
of single diploid human cells 


Longzhi Tan, Dong Xing, Chi-Han Chang, Heng Li, X. Sunney Xie


Three-dimensional genome structures play a key role in gene regulation and cell functions.
Characterization of genome structures necessitates single-cell measurements. This has been
achieved for haploid cells but has remained a challenge for diploid cells. We developed a single-
cell chromatin conformation capture method, termed Dip-C, that combines a transposon-based
whole-genome amplification method to detect many chromatin contacts, called META
(multiplex end-tagging amplification), and an algorithm to impute the two chromosome
haplotypes linked by each contact. We reconstructed the genome structures of single diploid
human cells from a lymphoblastoid cell line and from primary blood cells with high spatial
resolution, locating specific single-nucleotide and copy number variations in the nucleus. The
two alleles of imprinted loci and the two X chromosomes were structurally different. Cells of
different types displayed statistically distinct genome structures. Such structural cell typing is
crucial for understanding cell functions.


The nucleus of a human diploid cell contains
46 chromosomes, 23 maternal and 23 pa-
ternal, together carrying 6 Gb of genomic
DNA. The three-dimensional (3D) genome
structure is thought to be crucial for the
regulation of gene expression and other cellular
functions (1). For example, the nuclei of sensory
neurons assume unusual architectures in the 
mouse visual (2) and olfactory systems (3). Chro- 
matin conformation capture assays, such as 3C 
(4) and Hi-C (5), allow for studies of 3D genome 
structures in bulk samples through proximity 
ligation of DNA (6). However, the difference be- 
tween cells can only be observed by single-cell 
measurements. Single-cell chromatin conforma- 
tion capture methods avoid ensemble averaging 
(7-12) and have yielded 3D genome structures of 
haploid mouse cells (10, 17). However, character-
izing the 3D genome structures of diploid mam-
malian cells remains challenging (13). Here, we
used an improved chromatin conformation cap- 
ture method and phased (haplotype-resolved) 
single-nucleotide polymorphisms (SNPs) to dis- 
tinguish between the two haplotypes of each 
chromosome. This allowed us to examine the 
cell type dependence of 3D genome structures 
of diploid cells. 

Obtaining high-resolution 3D genome struc- 
tures of single diploid cells requires resolving a 
large number of chromatin “contacts”—pairs of 
genomic loci that are joined by proximity liga- 
tion. We developed a chromatin conformation 
capture method, termed Dip-C (Fig. 1A), that can 
detect more contacts than existing methods 


Fig. 1. Single-cell chromatin conformation capture and haplotype imputation 
by Dip-C. (A) Schematics of the chromatin conformation capture protocol. The 

3D information of chromatin structure was encoded in the linear genome through 
proximity ligation of chromatin fragments, as in 3C (4) and Hi-C (5, 19). The 
ligation product was then amplified by META (15) and sequenced. Colors represent 
genomic coordinates. Note that ligation products may be linear (illustrated here)
or circular (not shown). (B) Imputation of the two chromosome haplotypes linked
by each chromatin “contact” (red dots) in a representative single cell.

with minimal false positives. In particular, we
omitted biotin pulldown (8, 9) and conducted 
high-coverage whole-genome amplification with 
multiplex end-tagging amplification (META), 
which introduced few artifactual chimeras (14, 15). 
We detected a median of 1.04 million contacts 
per single cell (n = 17, minimum = 0.71 million, 
maximum = 1.48 million) from GM12878, a 
female human lymphoblastoid cell line, and a 


median of 0.84 million contacts (n = 18, mini- 
mum = 0.67 million, maximum = 1.08 million) 
from peripheral blood mononuclear cells (PBMCs) 
of a male human donor (16). This exceeds the
medians achieved with existing methods by a 
factor of ~5 (fig. S4 and table S1). Most cells were 
in the G1 or G0 phase of the cell cycle. In addition,
we simultaneously detected copy number varia- 
tions (CNVs), losses of heterozygosity (LOHs), 
DNA replication, and V(D)J recombination with 
a 10-kb bin size (figs. S2 and S3). 

Another challenge in reconstructing diploid 
genomes is to determine which haplotypes are 
involved in each chromatin contact (17-20) 
(table S1). To assign haplotypes, we developed an 
imputation algorithm (Fig. 1B). We reasoned that 
unknown haplotypes can be imputed from “neigh- 
boring” (in terms of genomic distances) contacts 
by assuming that the two homologs would typ- 
ically contact different partners. Using a statisti- 
cal property of interchromosomal and long-range 
intrachromosomal contacts (15), we defined a 
contact neighborhood as a superellipse with an 
exponent of 0.5 and a radius of 10 Mb, where 
haplotypes of nearby contacts were weighted 
in imputing the haplotypes of each contact (fig.
S7). In the Dip-C algorithm, after removing 3C/
Hi-C artifacts [contacts with few neighbors (11)]
and initial imputation, haplotypes can be option- 
ally refined through a series of draft 3D models 
(15) (fig. S5). Imputation accuracy was estimated 
to be ~96% for each haplotype by cross-validation
(15) (table S1). Regions harboring CNVs or LOHs,
as well as an apparently damaged GM12878 cell, 
were excluded from reconstruction (table S1). 
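
The neighborhood test and the idea of borrowing haplotypes from nearby contacts can be sketched as follows. The radius (10 Mb) and the superellipse exponent (0.5) come from the text; everything else is an assumption made for illustration: contacts are treated as `(position_1, position_2)` pairs on one fixed pair of chromosomes, and an unweighted majority vote replaces the weighting scheme of the actual algorithm (15). The function names are hypothetical.

```python
from collections import Counter

RADIUS_BP = 10_000_000  # 10 Mb neighborhood radius, from the text
EXPONENT = 0.5          # superellipse exponent, from the text

def in_neighborhood(contact, other, radius=RADIUS_BP, exponent=EXPONENT):
    """True if `other` lies inside the superellipse centered on `contact`.

    Both arguments are (position_1, position_2) in base pairs, assumed to
    refer to the same pair of chromosomes.
    """
    dx = abs(contact[0] - other[0]) / radius
    dy = abs(contact[1] - other[1]) / radius
    return dx ** exponent + dy ** exponent <= 1.0

def impute_haplotypes(contact, phased_contacts):
    """Guess the haplotype pair of `contact` from phased neighbors.

    `phased_contacts` is a list of ((position_1, position_2), (hap_1, hap_2)).
    Returns the most common neighboring haplotype pair, or None if no phased
    neighbor exists. Assumption: simple majority vote, not the weighted
    scheme of the published method.
    """
    votes = Counter(
        haps for pos, haps in phased_contacts if in_neighborhood(contact, pos)
    )
    if not votes:
        return None
    return votes.most_common(1)[0][0]
```

Because the exponent is below 1, the neighborhood is star-shaped rather than round: a contact that is close in one coordinate may be far in the other and still count as a neighbor, whereas a contact that is moderately far in both coordinates does not.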
We reconstructed the 3D diploid human ge- 
nomes at 20-kb resolution. Reconstruction was 
successful without supervision for 94% (15 of 16) 
of the GM12878 cells and 67% (12 of 18) of the 
PBMCs, and after removal of small problematic 
regions for 6% (1 of 16) of the GM12878 cells 
and 22% (4 of 18) of the PBMCs (table S1 and 
fig. S8) (15). Note that because chromatin con-
formation capture (the process of converting
3D coordinates to chromatin contacts) is in-
trinsically lossy and noisy, our 3D structures
harbored additional uncertainties including per- 
turbations of chromatin structures during the 
experiments, inaccuracies in the energy function 
used by 3D modeling, and nuclear volumes in- 
accessible to DNA sequencing (e.g., centromeres, 
nucleoli, and nuclear speckles). These uncertain- 
ties are common to all 3C/Hi-C studies and are 
difficult to estimate, and imputation may be less 
successful when two homologs are nearby or adopt 
similar shapes. Therefore, other problematic 
regions might persist even after manual removal. 
Figure 2A shows a representative cell. Each 
particle, displayed as a colored point, represents 
20 kb of chromatin, or a radius of ~100 nm. A 
lower bound for reconstruction uncertainty was 
estimated from the median deviation of ~0.4 par- 
ticle radii (~40 nm) across all 20-kb particles 
between three replicates (fig. S9 and table S1). 
Well-known nuclear morphologies were observed 
in an M/G1-phase GM12878 cell, where chromo-
somes retained their characteristic V shapes after
recent mitosis, and in several PBMCs, where 
multiple nuclear lobes were reminiscent of the 
partially segmented nuclei of low-density neu- 
trophils and other blood cell types (Fig. 2B). 
We also used published data on mouse em-
bryonic stem cells (mESCs) (10) to reconstruct
3D diploid mouse genomes despite fewer con- 
tacts (~0.3 million per cell, or ~0.2 million under 
our definition) (table S1), because the mouse line 
harbored more SNPs than humans (15). 
Similar to previously described haploid mouse 
genomes (10, 12), the diploid human genomes 
exhibited chromosome territories (Fig. 2A) and 
chromatin compartments [visualized by CpG fre- 
quency as a proxy (21)], with the heterochromatic 
compartment B (5) concentrated at the nuclear 
periphery and around foci in the nuclear center 
(Fig. 2C). Spatial clustering of DNA sequences 
with similar CpG frequencies suggests a corre- 
lation between primary sequence features and 
3D genome folding (7). 
Our 3D structures revealed different radial 
preferences across the human genome (black dots

Fig. 2. 3D genome structures of single diploid human cells. (A) 3D genome structure of a
representative GM12878 cell. Each particle represents 20 kb of chromatin, or a radius of ~100 nm.
(B) Peculiar nuclear morphology in a cell that recently exited mitosis (top) and in a cell with multiple
nuclear lobes (bottom). (C) Serial cross sections of a single cell showing compartmentalization of
euchromatin (green) and heterochromatin (magenta), visualized by CpG frequency as a proxy (21).
(D) Radial preferences across the human genome, as measured by average distances to the nuclear
center of mass. Our results (black dots, smoothed by 1-Mb windows) agree well with published DNA FISH
data (gray lines) on whole chromosomes (22) (shifted and rescaled) and provide fine-scale information.
Lower and upper axis limits were 20 and 50 particle radii, respectively, for the black dots. GM12878
cell 4 (extensive chromosomal aberrations) and cell 16 (M/G1 phase) were excluded. (E) Example radial
preferences of two chromosomes. The gene-rich chromosome 19 preferred the nuclear interior (left),
whereas the gene-poor chromosome 18 almost always resided on the nuclear surface (right). (F) Stochastic
fractal organization of chromatin was quantified by a matrix of radii of gyration of all possible subchains of
each chromosome (heat maps). We identified a hierarchy of single-cell domains across genomic scales
(black trees). A subtree was simplified as a black triangle if either of its two subtrees was below a certain size
(from left to right: 10 Mb, 2 Mb, 500 kb, 100 kb). In each panel, the region from the previous panel is
shown in transparent gray. In the rightmost panel, thick sticks (top) and circles (bottom) highlight the
formation of a known CTCF loop (19). Spheres with arrows (top) indicate the positions and orientations of
the two converging CTCF sites. Genomic coordinates are for the human genome assembly hg19.

in Fig. 2D). Our results agree well with whole-
chromosome painting data by DNA fluorescence
in situ hybridization (FISH) (22) (gray lines in 
Fig. 2D). Both methods show that the gene-rich 
chromosome 19 prefers the nuclear interior, while 
the gene-poor chromosome 18 prefers the nu- 
clear periphery (Fig. 2E). Within each chromo- 
some, different segments could have distinctly 
different radial preferences, which were corre- 
lated with chromatin compartments (fig. S11A). 
For example, the CpG-rich euchromatic end (left) 
of chromosome 1 was heavily biased toward the 
nuclear center, whereas some other regions on 
the same chromosome were biased toward the 
nuclear periphery (Fig. 2D). Such fine-scale 
information cannot be obtained from whole- 
chromosome painting (22, 23) experiments. 
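
A radial preference of this kind, the average distance of each particle from the nuclear center of mass, can be computed from reconstructed coordinates roughly as follows. This is a minimal sketch under stated assumptions: every particle carries equal mass, every cell lists the same particles in the same order, and the function names are hypothetical rather than taken from the Dip-C software.

```python
import math

def center_of_mass(coords):
    """Unweighted centroid of a list of (x, y, z) particle coordinates."""
    n = len(coords)
    return tuple(sum(p[k] for p in coords) / n for k in range(3))

def radial_positions(coords):
    """Distance of every particle from the center of mass of its own cell."""
    center = center_of_mass(coords)
    return [math.dist(p, center) for p in coords]

def mean_radial_preference(cells):
    """Average each particle's radial position over cells.

    `cells` is a list of per-cell coordinate lists; particle i must refer to
    the same genomic bin in every cell (an assumption of this sketch).
    """
    per_cell = [radial_positions(coords) for coords in cells]
    n_particles = len(per_cell[0])
    return [
        sum(radii[i] for radii in per_cell) / len(per_cell)
        for i in range(n_particles)
    ]
```

Smoothing the returned list over consecutive bins (for example, 1-Mb windows of 20-kb particles) would give a profile comparable in spirit to the black dots of Fig. 2D.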
Our Dip-C results provide a holistic view of the 
stochastic, fractal organization of chromatin 
across different genomic scales. Bulk Hi-C sug-
gests that chromatin forms a “fractal globule”
with compartments (5, 19) and domains such
as topologically associating domains (TADs) (24) 
and CCCTC-binding factor (CTCF) loop domains 
(19). However, such fractal organization has not 
been visualized in single human cells in a genome- 
wide manner. We observed spatial clustering 
(globules) and segregation (insulation) of con- 
secutive chromatin particles along each chromo- 
some (Fig. 2F, upper panels). Such organization 
could be quantified by a matrix of radii of gy- 
ration of all possible subchains in each chromo- 
some (Fig. 2F, lower panels). Single-cell domains 
could then be identified as squares that had 
relatively small radii [partly similar to (8)] (15).
We found single-cell domains across all genomic 
scales and therefore identified them through 
hierarchical merging, yielding a tree of domains 
[partly similar to (25, 26) in bulk Hi-C] (Fig. 2F). 
On the smallest scale, some domains coincided 
with CTCF loop domains from bulk Hi-C (19) 
(rightmost panels in Fig. 2F). Single-cell do-

Fig. 3. Distinct 3D structures of the maternal and paternal alleles. (A) Structural difference
between the two alleles of the imprinted H19/IGF2 locus. Despite cell-to-cell heterogeneity, the
maternal allele more frequently separated IGF2 from both H19 and the nearby HIDAD site and
disrupted the IGF2-HIDAD CTCF loop (white and red circles). Spheres highlight three CTCF sites from
bulk Hi-C. Heat maps show the root-mean-square average pairwise distances between all 20-kb
particles. Haplotype-resolved bulk Hi-C (black heat map with 25-kb bins) is adapted from figure 7C of
(19). (B) Active (red) and inactive (blue) X chromosomes prefer extended and compact morphologies,
respectively, as shown by cross sections of two representative cells. (C) Individual active and inactive
X chromosomes can be distinguished by PCA of single-cell chromatin compartments, defined for
each 20-kb particle as the average CpG frequency of nearby (within 3 particle radii) particles. (D) The
inactive X chromosome tends to form the previously reported “superloops,” 27 very-long-range (5 to
74 Mb) chromatin loops identified by bulk Hi-C (19, 20, 29). Superloops are sorted by size.
(E) Haplotype-resolved contact maps (red dots) and 3D structures of the two X chromosomes in an
example cell. Black circles denote all superloops (19). White spheres denote four example superloop
anchors (DXZ4, x75, ICCE, and FIRRE). GM12878 cells 4 and 16 are excluded from (C) and (D).

mains were highly heterogeneous between cells,
frequently breaking and merging bulk domains 
(fig. S19), consistent with a recent study on tet- 
raploid mouse cells (8). 
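
The matrix of radii of gyration over all subchains can be computed efficiently with prefix sums, as in the sketch below. It assumes equally weighted particles ordered along one chromosome and uses the identity Rg² = mean(|r|²) − |mean(r)|². The function name is hypothetical, and the hierarchical merging of domains is not shown.

```python
import math

def gyration_matrix(coords):
    """Radius of gyration of every subchain i..j (inclusive) of a chain.

    `coords` is a list of (x, y, z) tuples ordered along the chromosome.
    Returns a symmetric n x n matrix; entry [i][j] is the radius of gyration
    of particles i through j.
    """
    n = len(coords)
    # Prefix sums of coordinates and of squared norms
    pref = [(0.0, 0.0, 0.0)]
    pref_sq = [0.0]
    for x, y, z in coords:
        sx, sy, sz = pref[-1]
        pref.append((sx + x, sy + y, sz + z))
        pref_sq.append(pref_sq[-1] + x * x + y * y + z * z)

    rg = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            m = j - i + 1
            mean_sq = (pref_sq[j + 1] - pref_sq[i]) / m
            cx = (pref[j + 1][0] - pref[i][0]) / m
            cy = (pref[j + 1][1] - pref[i][1]) / m
            cz = (pref[j + 1][2] - pref[i][2]) / m
            var = mean_sq - (cx * cx + cy * cy + cz * cz)
            # Clamp tiny negative values caused by floating-point rounding
            rg[i][j] = rg[j][i] = math.sqrt(max(var, 0.0))
    return rg
```

In such a matrix, a compact domain shows up as a square block of small values along the diagonal, which is the pattern the domain-calling step looks for.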

Traditional methods such as bulk Hi-C and 
two-color DNA FISH are pairwise measurements 
and thus cannot study multichromosome inter- 
mingling. In our 3D models, we quantified mul- 
tichromosome intermingling by the diversity of 
chromosomes (Shannon index) near each 20-kb 
particle (fig. S20A), revealing genomic regions 
that frequently contacted multiple chromosomes 
(fig. S20B). These regions were similar between 
the human cell types despite their different aver- 
age extents of intermingling (fig. S10), and they 
were mostly euchromatic (CpG-rich) (fig. S11B)
for two reasons: (i) Euchromatin more frequent- 
ly resided on the surface of chromosomes than 
did heterochromatin [consistent with (7)] (fig. 
S11D), and (ii) even when heterochromatin re- 
sided on the surface, it tended to face the nuclear 
periphery (12) (fig. S11A) and thus had no part-
ners to intermingle with. The intermingling
regions partially overlapped with “hubs” identi- 
fied by a recent report (27). 
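
A Shannon diversity index over the chromosomes found near each particle can be sketched as follows. The distance cutoff that defines "near" is a free parameter here, because the value used for fig. S20A is not given in this text. Each particle counts as its own neighbor, and the function names are hypothetical.

```python
import math
from collections import Counter

def shannon_index(labels):
    """Shannon index H = -sum(p * ln p) over the label frequencies."""
    counts = Counter(labels)
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values())

def intermingling_index(coords, chroms, cutoff):
    """Chromosome diversity around every particle.

    `coords` holds (x, y, z) tuples, `chroms` the chromosome label of each
    particle, and `cutoff` the neighborhood radius in the same length unit
    as the coordinates (an assumed parameter). The all-pairs loop is
    quadratic and only meant for small examples.
    """
    result = []
    for p in coords:
        nearby = [
            chroms[j] for j, q in enumerate(coords) if math.dist(p, q) <= cutoff
        ]
        result.append(shannon_index(nearby))
    return result
```

A particle surrounded only by its own chromosome scores zero, and the score grows as more chromosomes share its neighborhood in more even proportions, which is what distinguishes multichromosome intermingling from a pairwise contact.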

We examined the structural relationship be- 
tween the maternal and paternal alleles, which 
can only be studied in diploid cells. Our data 
captured the structural difference between the 
two alleles caused by genomic imprinting. At im- 
printed loci, the two alleles can differ drastically 
in transcriptional activity (28). Near the mater- 
nally transcribed H19 gene and the paternally 
transcribed IGF2 gene, bulk Hi-C identified dif-
ferent contact profiles and different use of CTCF
loops between the two homologs (19). We di- 
rectly visualized this ~0.6-Mb region in single 
cells (Fig. 3A). Despite cell-to-cell heterogeneity, 
the maternal allele more frequently separated 
IGF2 from both H19 and the nearby HIDAD
site and disrupted the IGF2-HIDAD CTCF loop, 
whereas the paternal allele more frequently 
stayed fully intermingled. 

X chromosome inactivation (XCI) presents a 
striking example of the difference between two 
homologs (28). As expected, we found in the 
female GM12878 cell line that the active X chro- 
mosome [the maternal allele based on RNA 
expression (15)] tended to exhibit an extended 
morphology, and the inactive X a compact one 
(Fig. 3B), although in some cells this morpho- 
logical difference was not obvious. More con- 
sistently, the two X chromosomes in each cell 
were characterized by their distinct patterns of 
chromatin compartments. The active X featured 
clear compartmentalization of euchromatin and 
heterochromatin, resembling that of the male X 
(in PBMCs); in contrast, compartments along 
the inactive X were more uniform (fig. S12E). 
Individual X chromosomes could be clearly sep- 
arated into active and inactive clusters by prin- 
cipal components analysis (PCA) of single-cell 
compartments (Fig. 3C). Our conclusion held 
if single-cell compartments were defined on 
the basis of contacts [partly similar to (10)]
rather than 3D structures (fig. S15, A and B). 
We also visualized the simultaneous formation

Fig. 4. Cell type-specific chromatin structures. (A) Quantification of the organization of
centromeres and telomeres. The mESCs exhibit stronger Rabl configuration (horizontal axis; the
length of summed centromere-to-telomere vectors normalized by the total particle number, which
differs between human and mouse; axis limit = 0.005 particle radii), whereas the PBMCs tend to
point centromeres outward relative to telomeres (vertical axis; the summed centromere-to-telomere
difference in distance from the nuclear center of mass normalized by the total particle number;
axis limit = 0.007 particle radii). Each marker represents a single cell; cell type was inferred
by V(D)J recombination in PBMCs (table S1 and fig. S3B). (B) Quantification of chromosome
intermingling (vertical axis; the average fraction of nearby particles that are not from the same
chromosome) and chromatin compartmentalization (horizontal axis; Spearman correlation between
each particle's own CpG frequency and the average of nearby particles). (C) Example cross
sections of three cell types, colored according to chromosome (left) or by the multichromosome
intermingle index (right). (D) Among the human cells, four cell type clusters (shaded), namely
B lymphoblastoid cells, presumable T lymphocytes, B lymphocytes, and presumable
monocytes/neutrophils (PBMC cells 9, 14, and 18), could be distinguished from the differential
formation (defined as end-to-end distance < 3 particle radii) of known cell type-specific
promoter-enhancer loops from published bulk promoter capture Hi-C (35). (E) The same four
clusters could also be distinguished by unsupervised clustering via PCA of single-cell chromatin
compartments, without the need for bulk data. The two alleles of each locus were treated as two
different loci. GM12878 cell 16 was excluded from (D) and (E). (F) An example region that was
differentially compartmentalized between two cell types (black, B lymphoblastoid cells; red,
presumable T lymphocytes). Right panels visualize the configuration of the ~0.5-Mb region
(chr 13: 62.5 to 63 Mb, thick yellow sticks) with respect to the rest of the genome
(transparent, colored by CpG frequencies) in two representative cells. Only the paternal alleles
are shown. Bulk Hi-C (black heat map with 50-kb bins) is from (19, 41). GM12878 cell 4 was
excluded.




of multiple “superloops” (19, 20, 29) in the in- 
active X chromosome (Fig. 3, D and E). Averaged 
contact matrices of the inactive and active X 
chromosomes agreed well with bulk Hi-C (19)
(fig. S15, C and D). 

In contrast to XCI, it is unknown whether 
single-cell compartments of two autosomal alleles 
may vary in a coordinated manner. By decom- 
posing the variability of single-cell compartments 
into between-cell and within-cell differences (fig.
S12A), we found that autosomal alleles fluctuate
(with respect to their median compartments) al-
most independently from each other, exhibiting
on average near-zero Spearman correlation (fig.
S12D). Our conclusion held if compartments were
defined on the basis of contacts (fig. S16). 
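
The Spearman correlation used to compare the fluctuations of two alleles is simply the Pearson correlation of their ranks. A self-contained sketch follows, with average ranks assigned to ties. The function names are hypothetical, and the inputs would be, for one locus, the per-cell deviations of the maternal and paternal compartment values from their respective medians.

```python
import math

def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation; raises ZeroDivisionError if an input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def spearman(x, y):
    """Spearman rank correlation between two equally long sequences."""
    return pearson(average_ranks(x), average_ranks(y))
```

A value near zero across cells indicates that a cell in which one allele sits in a more euchromatic environment than usual is no more likely to show the same shift on the other allele.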

We can pinpoint genomic changes, such as 
SNPs and CNVs, to their precise spatial locations 
in the cell nucleus. The donor of the GM12878 
cell line carried a heterozygous G-to-A mutation
(rs4244285) in the cytochrome P450 gene CYP2C19,
leading to a truncated, nonfunctional protein 
variant CYP2C19*2 and affecting metabolism of 
hormones and drugs (30). Figure S18A shows 
the 3D localization of this drug-response SNP 
on the paternally inherited chromosome 10 of 
a GM12878 cell. In addition to inherited muta- 
tions, single cells also harbor somatic changes. 
In lymphocytes, somatic V(D)J recombination 
generates diversity of immunoglobulins and T cell 
receptors by DNA deletions and inversions. Fig- 
ure S18B shows the 3D localization of two V(D)J 
recombinations at a T cell receptor locus, lead- 
ing to two different DNA deletions on the two 
alleles of chromosome 14 of a T lymphocyte. The 
capability to spatially localize genomic changes 
is important for studying cancers and inherited 
diseases, where mutations can have severe con- 
sequences and may disrupt the chromatin struc- 
ture of nearby regions. 
We also examined the cell type dependence of 
3D genome structures. Similar to haploid mESCs 
(11), chromosomes in diploid mESCs preferred the 
Rabl configuration (centromeres pointing toward 
one side of the nucleus and telomeres toward the 
other), albeit to a different extent in each cell (Fig. 4A). 
In contrast, we found the Rabl configuration to 
be weak in most GM12878 cells and PBMCs. 
Most PBMCs pointed their centromeres toward 
the nuclear periphery and telomeres toward the 
nuclear center, consistent with previously reported 
arrangements in human lymphocytes (31). By 
contrast, the M/G1-phase GM12878 cell pointed 
centromeres toward the outer rim of a char- 
acteristic mitotic rosette. 

The overall extent of chromosome intermin- 
gling also differed among the cell types. Chro- 
mosomes tended to intermingle less in mESCs 
and more in PBMCs, with GM12878 intermediate 
between them (Fig. 4, B and C), consistent with 
previous reports that chromosomes intermingle 
less in the pluripotent mESCs than in terminally 
differentiated fibroblasts (32) and that chromo- 
somes intermingle more in resting human lym- 
phocytes than in activated ones (which resembled 
GM12878) (33). As expected (10, 34), the M/G1-phase 
cell exhibited a low level of chromosome 
intermingling and the lowest level of chromatin 
compartmentalization. 

Cell type-dependent promoter-enhancer loop- 
ing has been suggested to underlie differential 
gene expression (35). Among the human cells, 
differential formation of known cell type-specific 
promoter-enhancer loops [based on cell type- 
purified bulk Hi-C (15, 35)] clearly separated 
the single cells into four cell type clusters: B 
lymphoblastoid cells (GM12878), presumable T 
lymphocytes, B lymphocytes, and presumable 
monocytes/neutrophils (Fig. 4D). Defining loop 
formation on the basis of contacts rather than 3D 
structures yielded similar results (fig. S17A). 

Cell type clusters could be equally well sepa- 
rated in an unsupervised manner, without prior 
knowledge of the cell types. Unlike ensemble- 
averaged structures such as protein crystal struc- 
tures, single-cell 3D genomes are intrinsically 
stochastic and dynamic. Statistical characterization 
such as PCA is necessary to distinguish different 
cell types, in which clusters of single cells corre- 
spond to valleys in a Waddington landscape (36) 
of certain cellular phenotypes. This kind of cell 
typing has been carried out using phenotype 
variables such as single-cell transcriptomes (37) 
and open chromatin regions (38, 39), each of 
which must have underlying structural differ- 
ences in the 3D genome. 

With Dip-C, we are in a position to carry out 
cell typing with genome structure as the sole 
variable. Given the high information content of 
3D structures, many possible features might be 
used in cluster analysis. Here, we chose single- 
cell chromatin compartments as the input var- 
iable of PCA. The four cell type clusters were 
clearly separated (Fig. 4E), with one of the most 
differentially compartmentalized regions shown 
in Fig. 4F. Our conclusion held if compartments 
were defined on the basis of contacts (fig. S17, B 
and C). Previous reports (7, 8, 10-12) had focused 
on defining the width (or spread) of a single 
Waddington valley, studying, for example, cell 
cycle dynamics within a cell type and domain 
stochasticity within a cell cycle phase. Our PCA 
result, in contrast, highlighted the consistent dif- 
ference among cell types, signifying the separa- 
tion between Waddington valleys. 
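
As an illustration of the kind of analysis described above, the following is a minimal sketch of projecting cells onto a first principal component. It assumes a hypothetical input in which each row is one cell and each column is a compartment value for one genomic region; the function name, the power-iteration approach, and the toy data are illustrative assumptions rather than the authors' implementation.

```python
def first_pc_scores(rows, iterations=200):
    """Project each row (one cell) onto the first principal component.

    Uses power iteration on the column-centered data, which converges to
    the leading eigenvector of the covariance matrix for typical inputs.
    """
    n, d = len(rows), len(rows[0])
    col_means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - col_means[j] for j in range(d)] for r in rows]

    # Deterministic, non-uniform starting direction (an arbitrary choice).
    v = [1.0 / (j + 1) for j in range(d)]
    for _ in range(iterations):
        # One step of v <- X^T (X v), followed by normalization.
        scores = [sum(a * b for a, b in zip(row, v)) for row in centered]
        w = [sum(scores[i] * centered[i][j] for i in range(n)) for j in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0:
            raise ValueError("data have no variance along the start direction")
        v = [x / norm for x in w]
    return [sum(a * b for a, b in zip(row, v)) for row in centered]


# Toy example: two hypothetical cell types with opposite values
# in the first two regions and identical values in the third.
cells = [
    [0.90, 0.10, 0.5], [0.80, 0.20, 0.5], [0.85, 0.15, 0.5],  # type A
    [0.10, 0.90, 0.5], [0.20, 0.80, 0.5], [0.15, 0.85, 0.5],  # type B
]
pc1 = first_pc_scores(cells)
```

In this toy data the two groups fall on opposite sides of zero along the first component, which is the sense in which clusters are "separated" by PCA.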

Our initial examination of only a handful of 
cell types has clearly shown the tissue depen- 
dence of 3D genome structures. A systematic 
survey of more cell types under various con- 
ditions will likely lead to new discoveries in cell 
differentiation, carcinogenesis, learning and mem- 
ory, and aging. 
YouTu Computer Vision Summit (TCVS). In this summit, world-renowned scholars, experts and partners 
of Tencent YouTu Lab from all over the world will gather together to share their perspectives on the latest 
technology development, scientific innovation, application and the future trend of computer vision. More- 
over, Tencent YouTu Lab will also announce its recent research achievement and development strategy. 


Time 
2018.9.6 9:30-17:30 (GMT+8) 


Webcast 


Visit for subscription and live stream. 


* Tencent YouTu Lab | Science 
Department of Defense Ebola Outbreak Response 
in the Democratic Republic of the Congo 


By Dr. Clay Holloway & Hannah Feldman, Medical Countermeasure Systems 


A Navy laboratory technician pipettes drug product for use in a filovirus vaccine. 
Photograph: Navy Visual News Service/MC3 Jake Berenguer 


In the past decade, we have seen an increase in infectious disease 
outbreaks, most notably with the West Africa Ebola outbreak in 
2014, which left more than 11,000 dead and 17,000 survivors, 
many of whom still require post-recovery medical care. The threat 
continues today. Public health officials and business leaders like 
Bill Gates have long warned that the world is not ready for the next 
pandemic. 


Global health officials say it's more important than ever for disease 
outbreaks to be stopped at their source before they become full-blown 
epidemics. And that is exactly what the Department of Defense—in 
collaboration with the World Health Organization (WHO), interagency, 
and other international partners—did in response to the latest Ebola 
outbreak in the Democratic Republic of the Congo (DRC). 


On May 8, the Ministry of Health of the DRC notified the WHO of an 
Ebola virus outbreak in Bikoro Health Zone in the northwest Equateur 
Province. A total of 54 cases and 33 deaths were reported with over 
900 identified contacts. The WHO worked closely with the DRC 
government to rapidly scale up its operations and mobilize health 
partners. 


Ebola is endemic to the DRC. The latest outbreak occurred in 2017, 
and was quickly contained. The WHO attributed this success to the 
quick testing of blood samples, announcing the outbreak early, and 
a rapid response from health authorities. 


Department of Defense Response 

The Joint Program Executive Office for Chemical, Biological, 
Radiological, and Nuclear Defense (JPEO-CBRND) and the Joint 
Program Management Office for Medical Countermeasure Systems 
(JPM-MCS) played a significant role in the U.S. Government's 
response to the recent outbreak. The JPEO-CBRND is the 
Department of Defense’s single focal point for developing and 
fielding of chemical, biological, radiological, and nuclear defense 
equipment and medical countermeasures. 


The JPEO recognized the importance of contributing to the Ebola 
effort based on its previous and continued development of medical 
countermeasures to protect against, identify, and treat the Ebola virus. “We 
are utilizing a whole-of-government approach to this outbreak. While 
our mission space is ultimately to support the warfighter, it is our 
imperative to support our domestic and international public health 
partners in this response,” said Mr. Douglas Bryce, Joint Program 
Executive Officer for JPEO-CBRND. “We are supporting the global 
community in whatever way we can.” 


Prevention 


The MCS product offices responsible for vaccine and platform 
technology development engaged in multiple efforts to accelerate 
development of Ebola vaccine candidates. Of note, the WHO and 
Médecins Sans Frontières (MSF) utilized the Department of Defense-
funded rVSVΔG-ZEBOV vaccine in a ring vaccination campaign. 
MCS, in collaboration with the Defense Threat Reduction Agency, 
provided early testing and development activities for the vaccine. 
During the exercise, more than 3,000 people were vaccinated. 

MCS currently maintains a 23,000 multi-dose vial stockpile of the 
investigational vaccines in case additional doses are required. 


Diagnosis 

The WHO developed a National Laboratory Strategy in response to 
the outbreak. As part of this effort, the Defense Biological Product 
Assurance Office deployed two co-funded Nebraska Strategic 
Research Institute personnel to support sequencing efforts. The 
personnel took sequencing equipment, consumables, and reagents 
to establish a sequencing capability at the Institut National de 
Recherche Biomédicale within the DRC. This capability sequenced 
Ebola samples from the outbreak so the data can be shared with the 
biodefense community on critical reagents for the Ebola virus. 


Treatment 

MCS's biological therapeutics office worked closely with the WHO 
and MSF on their Ebola drug treatment in development. The drug 
Remdesivir, developed by Gilead Sciences and funded by the 
Department of Defense, was deployed to the DRC for emergency 
use during the outbreak. 


In addition, the MCS platform technologies office partnered with the 
National Allergy and Infectious Disease Vaccine Research Center to 
rapidly manufacture their VRC-114 anti-Ebola monoclonal antibody 
in support of conducting clinical trials for a vaccine and therapeutic 
in the DRC. By leveraging the DOD’s existing monoclonal antibody 
platform and Advanced Development and Manufacturing facility, 
MCS can rapidly produce additional doses of VRC-114 to be 
used in treating Ebola-infected individuals, thereby accelerating FDA 
approval of an Ebola virus therapy for U.S. Forces and the WHO. 


Results 


The current Ebola outbreak is contained, thanks in part to the JPEO- 
CBRND’s support to U.S. Government and international partners. 
The DRC Ministry of Health and WHO announced the official end of 
the outbreak on July 24. According to the WHO, there still remains 
a risk for resurgence or flare-ups posed by potentially undetected 
transmission chains, such as the latest outbreak in eastern DRC 
that began at the beginning of August. However, strengthened 
surveillance mechanisms and survivor monitoring programs are in 
place to rapidly mitigate, detect, and respond to such events. 


COL David Hammer, Joint Project Manager for MCS, praised the 
Department’s response to the outbreak. “Our team contributed 
greatly to containing the outbreak. The WHO's medical 
countermeasure options would've been constrained if the products 
developed by JPEO-CBRND and the DOD were not utilized. We 
couldn’t have provided support without collaborating with our 
inter- and intra-agency partners, local teams, and the international 
community. The DOD's response saved lives,” he said. 


About Medical Countermeasure Systems 

JPM-MCS, a component of the U.S. Department of Defense’s Joint 
Program Executive Office for Chemical, Biological, Radiological, and 
Nuclear Defense, aims to provide U.S. military forces and the nation 
with safe, effective, and innovative medical solutions to counter 
chemical, biological, radiological, and nuclear threats. 

To contact MCS: usarmy.detrick.dod-jpeo-cbrnd.list.communications@mail.mil 
Biomedical science training 


Cherie Butts¹ and Avery August² 


tackle the challenges of making better medicines remains 
high; however, few scientists and clinicians learn about drug 
development during their training. To assist trainees with 
appreciating differences between basic science (understanding 
disease mechanisms) and applied science (drug development), 
Biogen and the Cornell Broadening Experiences in Scientific 
Training (BEST) program convened a conference in June 2018 at the 
Biogen headquarters in Cambridge, Massachusetts 
(#BiogenBESTDDConf2018). 
FIGURE 1. 


Selection process 

Participants were identified primarily from academic institutions with 
U.S. National Institutes of Health (NIH) BEST programs (www.nihbest.org; 
https://commonfund.nih.gov/workforce), as they are familiar with biophar- 
ma career pathways. Trainees* were exposed to key drug development 
questions, different roles in and out of the laboratory or clinic, and skills 
needed to be successful in biopharma. To ensure that information from 
the conference extended beyond those who attended, a requirement was 
that trainees share key concepts with others at their home institutions. 


Unique approach 

The average time for developing a new drug is approximately 12 
years and costs over USD 1 billion, predominantly due to failures at 
each stage of drug development (1). An appropriately trained 
workforce is one mechanism for accelerating timelines and reducing the risk 
of failure. As many biomedical sciences training programs do not offer 
activities related to drug development, trainees must opt for additional 
specialized fellowships (ranging in duration from several weeks to a 
few years) or transition to industry with little knowledge of the skills 
necessary to be successful in this sector. As an initiative of Biogen’s 
Portfolio Transformation, a short-term, intensive conference was devel- 
oped to demystify drug development for academic trainees. The goal 
was to create a model for similar events across the country. 

High-performing project teams are a hallmark of biopharma, but 
are less common in academia (2). Conference activities, therefore, fo- 
cused on providing participants with a project-team experience that 
highlighted key drug development questions; stage-appropriate com- 
position of project teams; the importance of team dynamics and of 
maximizing the strengths of each member; and how the biopharma 
ecosystem supports project teams. 


¹Portfolio Transformation and Late-Stage Clinical Development, Biogen, Cambridge, MA; 
cherie.butts@biogen.com 
²Cornell BEST Program and Department of Microbiology and Immunology, 
Cornell University, Ithaca, NY; averyaugust@cornell.edu 


Purposeful outcomes 

Participants were introduced to the drug development process (from 
concept to approval), and sessions were led by individuals from across 
Biogen, who offered insight on their roles—including how they support 
project teams. The topics included asset management, biomarker de- 
velopment, business and data analytics, clinical development, medical 
affairs, portfolio management, protein engineering, and regulatory af- 
fairs and policy (Figure 1). In addition, participants served on teams 
that generated a business case and recommendations for progression 
of a mock project to the next drug development stage (Figure 2). 


Refining and reframing 

A new training model is needed to strengthen and refine the neces- 
sary skills for those who wish to translate new biomedical discoveries 
into beneficial drugs. More trainees with the right experience will in- 
crease the pace of drug development, reducing the burden of debili- 
tating medical conditions on society. Such a reframing of the training 
experience will positively change the conduct of science and expand 
the ways that meaningful contributions to biomedical science are 
defined. This conference emphasizes the importance of experiential 
learning and serves as a model for such training. 


Conference format 

Team selection: Participants complete preconference surveys to 
identify expertise and professional interests. 

Tailored team experience: Teams receive topics and supporting 
information to generate business case. 

Team-building emphasizes value of shared responsibility. 

Feedback assesses impact of conference. 

Topic: Session leaders discuss their career paths, current roles, 
and contributions to project teams. 


FIGURE 2. 
For additional information and to explore future opportunities with 
this drug development training model, please contact the authors: 
cherie.butts@biogen.com and averyaugust@cornell.edu. 


*40 Ph.D. students, postdocs, and medical students from Boston University; 
Cornell University; Emory University; Georgia Institute of Technology; Johns 
Hopkins University; Meharry Medical College; Michigan State University; New 
York University; Rutgers University; Universities of California at Davis, Irvine, 
and San Francisco; Universities of Colorado at Anschutz and Denver; University 
of North Carolina at Chapel Hill; University of Massachusetts Medical School; 
University of Rochester; Vanderbilt University; Virginia Polytechnic Institute and 
State University; and Wayne State University. 
BROADER IMPACT 
BI01 High Impact Practice—Increasing Ethnic and Gender Diversification 
in Engineering Education 


CHARACTERIZATION, PROCESSING AND THEORY 
CP01 Advances in In Situ Experimentation Techniques Enabling Novel and Extreme 
Materials/Nanocomposite Design 
CP02 Design and In Situ TEM Characterization of Self-Assembling Colloidal Nanosystems 
CP03 Advances in In Situ Techniques for Diagnostics and Synthetic Design 
of Energy Materials 
CP04 Interfacial Science and Engineering—Mechanics, Thermodynamics, 
Kinetics and Chemistry 
CP05 Materials Evolution in Dry Friction—Microstructural, Chemical 
and Environmental Effects 
CP06 Smart Materials for Multifunctional Devices and Interfaces 
CP07 From Mechanical Metamaterials to Programmable Materials 
CP08 Additive Manufacturing of Metals 
CP09 Mathematical Aspects of Materials Science—Modeling, Analysis and Computations 


ELECTRONICS AND PHOTONICS 

Soft Organic and Bimolecular Electronics 
EP01 Liquid Crystalline Properties, Self-Assembly and Molecular Order 
in Organic Semiconductors 
EP02 Photonic Materials and Devices for Biointerfaces 
EP03 Materials Strategies and Device Fabrication for Biofriendly Electronics 
EP04 Soft and Stretchable Electronics—From Fundamentals to Applications 
EP05 Engineered Functional Multicellular Circuits, Devices and Systems 
EP06 Organic Electronics—Materials and Devices 

Semiconductor Devices, Interconnects, Plasmonic 
and Thermoelectric Materials 
EP07 Next-Generation Interconnects—Materials, Processes and Integration 
EP08 Phase-Change Materials for Memories, Photonics, Neuromorphic 
and Emerging Application 
EP09 Devices and Materials to Extend the CMOS Roadmap for Logic 
and Memory Applications 
EP10 Heterovalent Integration of Semiconductors and Applications to Optical Devices 
EP11 Hybrid Materials and Devices for Enhanced Light-Matter Interactions 
EP12 Emerging Materials for Plasmonics, Metamaterials and Metasurfaces 
EP13 Thermoelectrics—Materials, Methods and Devices 


www.mrs.org/spring2019 


Don’t Miss These Future MRS Meetings! 


2019 MRS Fall Meeting & Exhibit 
December 1-6, 2019, Boston, Massachusetts 


2020 MRS Spring Meeting & Exhibit 
April 13-17, 2020, Phoenix, Arizona 


CALL FOR PAPERS 


Spring Meeting registrations include MRS Membership July 1, 2019 — June 30, 2020 


Abstract Submission Opens 
September 28, 2018 


Abstract Submission Closes 
October 31, 2018 


ENERGY AND SUSTAINABILITY 
Energy Storage 
ES01 Organic Materials in Electrochemical Energy Storage 
ES02 Next-Generation Intercalation Batteries 
ES03 Electrochemical Energy Materials Under Extreme Conditions 
ES04 Solid-State Electrochemical Energy Storage 
Catalysis, Alternative Energy and Fuels 
ES05 Cooperative Catalysis for Energy and Environmental Applications 
ES06 Atomic-Level Understanding of Materials in Fuel Cells and Electrolyzers 
ES07 New Carbon for Energy—Materials, Chemistry and Applications 
ES08 Materials Challenges in Surfaces and Coatings for Solar Thermal Technologies 
ES10 Rational Designed Hierarchical Nanostructures for Photocatalytic System 
ES11 Advanced Low Temperature Water-Splitting for Renewable Hydrogen Production 
via Electrochemical and Photoelectrochemical Processes 
ES12 Redox-Active Oxides for Creating Renewable and Sustainable Energy Carriers 
Water-Energy Materials and Sustainability 
ES09 Advanced Materials for the Water-Energy Nexus 
ES13 Materials Selection and Design—A Tool to Enable Sustainable Materials 
Development and a Reduced Materials Footprint 
ES14 Materials Circular Economy for Urban Sustainability 
Photovoltaics and Energy Harvesting 
ES15 Fundamental Understanding of the Multifaceted Optoelectronic Properties 
of Halide Perovskites 
ES16 Perovskite Photovoltaics and Optoelectronics 
ES17 Perovskite-Based Light-Emission and Frontier Phenomena— 
Single Crystals, Thin Films and Nanocrystals 
ES18 Frontiers in Organic Photovoltaics 
ES19 Excitonic Materials and Quantum Dots for Energy Conversion 
ES20 Thin-Film Chalcogenide Semiconductor Photovoltaics 
ES21 Nanogenerators and Piezotronics 


QUANTUM AND NANOMATERIALS 
QN01 2D Layered Materials Beyond Graphene—Theory, Discovery and Design 
QN02 Defects, Electronic and Magnetic Properties in Advanced 2D Materials 
Beyond Graphene 
QN03 2D Materials—Tunable Physical Properties, Heterostructures and Device Applications 
QN04 Nanoscale Heat Transport—Fundamentals 
QN05 Emerging Thermal Materials—From Nanoscale to Multiscale Thermal Transport, 
Energy Conversion, Storage and Thermal Management 
QN06 Emerging Materials for Quantum Information 
QN07 Emergent Phenomena in Oxide Quantum Materials 
QN08 Colloidal Nanoparticles—From Synthesis to Applications 


SOFT MATERIALS AND BIOMATERIALS 
SM01 Materials for Biological and Medical Applications 
SM02 Progress in Supramolecular Nanotheranostics 
SM03 Growing Next-Generation Materials with Synthetic Biology 
SM04 Translational Materials in Medicine—Prosthetics, Sensors and Smart Scaffolds 
SM05 Supramolecular Biomaterials for Regenerative Medicine and Drug Delivery 
SM06 Nano- and Microgels 
SM07 Bioinspired Materials—From Basic Discovery to Biomimicry 
Glass Media Bottles 

PUREGRIP borosilicate glass media bottles 
use a cap with a protruding ridge that 
improves a user's ability to safely handle 
the bottle with gloved hands, especially 
when the bottles are wet. The patented 
cap is easy to open or close, and provides 
a wide, flat surface for writing and 
labeling. Constructed of durable 
3.3 low-expansion ASTM E438 Type I Class 
A borosilicate glass, PUREGRIP bottles come in sizes of 100 mL, 
250 mL, 500 mL, 1 L, 2 L, 3 L, 5 L, 10 L, and 20 L. Each bottle in 
the product line includes volumetric gradations that have been 
individually certified for accuracy. 

Foxx Life Sciences 

For info: 603-890-3699 

www.foxxlifesciences.com 


Contained Filtration System 

The SupaClean system from Amazon Filters is designed for filtration 
applications where quick cleandowns are required in a multipurpose 
production site, or where hazardous materials are being processed 
and risk of operator exposure needs to be minimized. The system 
consists of Amazon Filters' standard range of high-performance filter 
cartridges enclosed inside two sealed plastic bags, all of which are 
contained inside a stainless-steel housing that provides the pressure 
vessel required for the filtration process. SupaClean is ideally suited 
to the production of coatings. Extensive validation testing ensures 
that the plastic bags enclosing the filters and product always remain 
attached even when full of liquid. Filter changeout is quick and 
simple—the filter assembly can be simply lifted out of the housing 
and replaced by a new unit, with minimal risk of the operator or 
surrounding area encountering the product. 

Amazon Filters 

For info: +44-(0)-1276-670600 

www.amazonfilters.com 


Secondary Antibodies 

Brilliant Violet 421 and Brilliant Violet 480 conjugated secondary 
antibodies allow you to add more colors to your multiple labeling 
assays in the violet-blue region of the spectrum. When combined with 
Alexa Fluor 488, Rhodamine Red-X, and Alexa Fluor 647 conjugates, 
effective five-color fluorescent labeling is possible. If nuclear 
counterstaining is desired, four-color antibody staining is possible 
using Brilliant Violet 421, Brilliant Violet 480, Alexa Fluor 488, and 
Rhodamine Red-X. Switching the nuclear stain from DAPI (emission in 
the blue region) to DRAQ5 (which has red emission) frees the violet-
blue region of the spectrum to accommodate the two Brilliant Violet 
dyes. DRAQ5's excitation and emission profiles overlap those of 
Alexa Fluor 647. We offer a range of secondary antibodies that 
are recommended for multiple labeling due to their minimal 
cross-reactivity. 

Jackson ImmunoResearch 

For info: 800-367-5296 

www.jacksonimmuno.com 
new products 


Chemically Competent and Electrocompetent Cells 

Vmax Express is a novel, fast-growing bacterial strain designed 
and optimized for high-level recombinant protein expression. This 
rationally engineered, next-generation prokaryotic protein expression 
system can serve as a replacement for slow-growing Escherichia coli 
systems that are prone to low yields and the expression of proteins 
as insoluble inclusion bodies. Vmax cells are derived from the marine 
microorganism Vibrio natriegens. This gram-negative, nonpathogenic 
bacterium has a growth rate about twice that of E. coli and produces 
three to four times its biomass. Utilizing a robust transcription and 
translation system to support this rapid growth rate, Vmax Express 
generates more biomass and more recombinant protein per liter 
faster. It is compatible with plasmids and antibiotics commonly used 
with bacterial expression systems such as E. coli BL21(DE3). Designed 
with a tightly controlled, IPTG-inducible T7 promoter system, 
Vmax Express cells can be cultured using routine growth medium as 
well as commercial autoinduction media, or our Vmax Enriched 
Growth Media. 

SGI-DNA 

For info: 855-474-4362 

www.sgidna.com/vmax-express.html 


Lentivector Plasmid and Prepackaged Virus 

System Biosciences’ family of Lenti-Labeler constructs facilitates a wide 
range of studies—including cell tracking, high-throughput assays, and 
more—by enabling efficient, reliable labeling of your cells. The 
pLL-CMV-GFP-T2A-Puro Lenti-Labeler construct expresses copepod 
green fluorescent protein (GFP) from the cytomegalovirus (CMV) 
promoter, which delivers strong expression in most commonly used 
cell lines (HeLa, HEK293, HT1080, etc.), and coexpresses the puromycin 
resistance gene for selection in vitro prior to in vivo use. Available as 
either fully propagatable, sequence-verified plasmid DNA or 
ready-to-transduce prepackaged lentivirus, pLL-CMV-GFP-T2A-Puro 
Lenti-Labeler is designed for reliability, so you can get valuable 
insights faster. 

System Biosciences 

For info: 888-266-5066 

www.systembio.com 


F-Actin Recombinant Protein 

Achieve fast staining and immediate functional analysis of 
filamentous actin (F-actin) in living and fixed cells with ibidi’s new 
reagent, LifeAct-TagGFP2 Protein. The addition of this recombinant 
protein extends a nontoxic, noninvasive alternative to phalloidin. 
F-actin is a key component of the cytoskeleton and is involved 
in many important cellular processes, such as cell division, cell 
migration, and endocytosis. LifeAct—a 17-amino-acid peptide 
derived from a protein found in yeast—specifically binds F-actin 
structures in living or fixed eukaryotic cells, while uniquely 
retaining full actin dynamics. Conjugated with green fluorescent 
protein (GFP), it can be conveniently introduced into cells, making 
it the ideal tool for F-actin labeling. With LifeAct-TagGFP2 Protein, 
scientists can easily transfect living cells, then get brilliant 
fluorescence microscopy images of cellular F-actin structures 
for analysis within minutes. 

ibidi 

For info: 844-276-6363 

ibidi.com 
Peking University 
Third Hospital 


Healthcare, Education and 
Innovation for 60 years 


Jie Qiao, M.D. 
President of The Peking University Third Hospital 


2018 marks the 60th anniversary of the founding of Peking Univer- 
sity Third Hospital (PUTH), which was founded in October 1958 
under the supervision of the Ministry of Health. As a hospital affiliated 
to China’s first national comprehensive university, PUTH is 
young and energetic, enjoying a rapid pace of development. Consist- 
ently ranked as a top tertiary hospital nationwide, PUTH integrates 
world-class medical services with cutting-edge medical research and 
first-rate medical education. Currently, it has a total of 7 campuses and 
over 6000 faculty and staff members. It has produced many notable 
scientists with national and international reputations, including one 
academician of the Chinese Academy of Sciences (CAS) and one 
academician of the Chinese Academy of Engineering (CAE). 

Over the past 60 years, PUTH has always ranked among the capital’s 
top hospitals by providing high quality medical care. In 2017, the 
hospital received more than 4 million ambulatory and emergency 
visits. The annual discharge number was over 100,000, the annual 
operation number over 60,000, and the average length of stay was 5.78 
days. Among the inpatients with intractable diseases, about 1/3 are 
from the other parts of China. 

In accordance with the principle of academic excellence, PUTH 
endeavors to establish world-class disciplines and research platforms, 
cultivate high-level experts with global vision, and provide excellent 
medical education. PUTH boasts one National Clinical Research 
Center, three Ministry of Education Key Laboratories, one Ministry of 
Health Key Laboratory, and over 5,000 square meters for Public Labo- 
ratory Service. Its faculty members have held over 40 positions as 
chief editors of top journals which are included in the China Scientific 
and Technical Papers and Citations Database (CSTPCD), over 110 
positions in academic associations/societies, and won many interna- 
tional awards. 

PUTH has a strong sense of social responsibility and is actively 
engaged in China’s public hospital reform. In recent years, it has 
undertaken many key national projects and programs, with research 
findings widely adopted by relevant government sectors. With 
improved management skills, it has played an exemplary role in clini- 
cal pathway, care service, counterpart support, and the Aid Program for 
Tibet and Xinjiang. Meanwhile, PUTH is also a care provider for 
unexpected emergencies on important occasions, such as the Beijing 
2008 Summer Olympics, the 60th anniversary of the People's Republic 
of China, and many earthquake and disaster relief projects. 

With 60 years of history, PUTH has created its distinct culture. All 
PUTH people will follow the motto of “Unity, Dedication, Practicality 
and Creativity”, and through high medical skills and a strong sense of 
duty, strive to become a medical center of national prominence, a 
diagnosis and treatment center for rare and grave diseases, a research 
center for clinical medicine, and a training base for medical experts. 


60th Anniversary of Peking University Third Hospital 


1958-2018 


The Department of Cardiology 

The Department of Cardiology, as a key clinical discipline, boasts a 
key Cardiovascular Molecular Biology and Regulatory Peptides Labo- 
ratory. Committed to addressing the clinical problems of major cardio- 
vascular diseases, our department conducts basic research on the risk 
factors and pathogenesis of coronary heart disease and heart failure, 
precautionary function of biomarkers, and the protective mechanisms 
of cardiac rehabilitation, from the aspects of susceptibility genes, 
biomarkers, receptors in cardiovascular diseases, drug therapy and 
exercise rehabilitation. Taking full advantage of its asset as part of a 
comprehensive hospital, our department has made progress in many 
fields, such as the diagnosis and treatment of critical cardiovascular 
disease, the evaluation of coronary artery disease, hybrid coronary 
intervention (HCR) and minimally invasive surgical bypass grafting, 
cardiac rehabilitation and cardiovascular drug safety. Over the past 
decade, the department has built and improved its many platforms, i.e. 
for cardiovascular function analysis, omics and medical bioinformat- 
ics, clinical trial on internationally standardized drugs, and cardiac 
rehabilitation. It has established a coronary heart disease cohort, a 
biorepository containing over 13,000 samples, as well as a big data
platform for Cardiovascular Disease Resource Repository, which 
collects research data on coronary heart disease, heart failure, hyper- 
tension, and arrhythmia with a follow-up system for quality assurance, 
laying a solid foundation for clinical research. 


The Department of Gynecology and Obstetrics 

Founded in 1958, the Obstetrics and Gynecology (OB-GYN) 
Department at PUTH is famous as the birthplace of the first IVF-ET
baby in mainland China. As a national clinical and applied research 
unit for reproductive health and related diseases, it has been designated 
as the national Clinical Research Center for Obstetrics and Gynecolo- 
gy as well as the base for medical cooperation on Reproductive Health 
and Population between WHO and Peking University (PKU). In 2016
and 2017, our department ranked No.1 in terms of its scientific and 
technologic influence. 

Funded by the Ministry of Science and Technology (MOST), the
Ministry of Education (MOE), and the Natural Science Foundation of
China (NSFC), our department has embarked on research into the
molecular mechanism of human gamete development control and the
pathogenesis of common reproductive diseases, and built comprehensive
platforms for reproductive endocrine epidemiology research,
preimplantation genetic screening (PGS) development, single-cell
multi-omics sequencing and bioinformatics analysis. Among its many
other achievements are the construction of genetic, epigenetic and
transcriptomic maps of the human gamete during embryonic development,
the establishment of gene regulatory networks (GRN), and several
clinical RCT studies. To date, 315 SCI papers have been published in
journals such as Cell, Nature, Lancet, and JAMA.

ADVERTISEMENT

[Photos: Medical research; Operating Theatre]

In the future, the OB-GYN Department will carry out more clinical 
and transformational research and establish more platforms for interna- 
tional cooperation for the improvement of reproductive health in 
China. 


The Department of Ophthalmology 

Established in 1958, the Department of Ophthalmology has been 
committed to clinic service, professional training and medical 
research. Our highly skilled ophthalmologists and staff provide a com- 
plete range of services on cataract, glaucoma, cornea and external 
disease, retinal diseases, pediatric ocular disorders, refractive and plas- 
tic surgery. Our clinical expertise and sophisticated diagnostic and 
treatment procedures make our department a nationwide referral 
center. 

Currently, we have more than 170 faculty members, including 13 
full professors and 21 associate professors, with an annual average of 
about 260,000 outpatients and 16,000 surgeries. In 2017, our depart- 
ment was ranked 6th nationwide in terms of its scientific and techno- 
logical influence. 

In the last five years, our physician-scientists have received more 
than 20 research grants worth about RMB 15 million from the government
and 41 grants worth approximately RMB 11.62 million from NGOs.
We published more than 90 peer-reviewed papers and boasted 20 
patent claims. As the base of Beijing Key Laboratory of Ophthalmolo- 
gy, we focus our research on the following fields: 

1. Stem cell study in glaucoma and Age-related Macular Degeneration
(AMD)

2. Neuroprotection of retina ganglion cell 

3. Clinical trial on ocular trauma, secondary glaucoma and corneal 
diseases 

4. Visual stimulation and vision-related neuroplasticity 

5. Ocular Surface Bioengineering 

We will keep updating knowledge and technology to provide greater 
customer service. 


The Department of Orthopedics 

Since its establishment in 1958, the Department of Orthopedics at
PUTH has been the pioneer in the surgical treatment of lumbar disc
herniation and cervical spondylosis, and has since won great fame at
home and abroad, particularly for its spine surgery. After 60 years of
unremitting efforts, our department has become an integrative medical
center specializing in spine, joint and trauma-related diseases.
Committed to medical treatment, training, research and recuperation,
our department is one of the most important orthopedic centers in China
and ranks top three nationally in its discipline recognition and
scientific & technologic influence.

[Photo: Outpatient Clinic]

Guided by the motto, “Healing with compassion, leading with excellence”,
our department continues to gather and cultivate talents as the
main strategy for improving its overall strength, to develop and adopt 
innovative treatment technologies to facilitate clinical diagnosis, and to 
conduct research on both common and intractable orthopedic diseases. 

In recent years, our department has intensified the establishment of 
research platforms. Now we boast a complete set of platforms for clin- 
ical research, basic research and translational research, having made 
breakthroughs in the development of orthopedic implant devices, 
particularly by adopting microporous titanium alloy in spine surgery 
with the aid of 3D printing technology. 


The Department of Sports Medicine 

Peking University Institute of Sports Medicine, established in Janu- 
ary 1959 by Professor Qu Mianyu, is the oldest and most prestigious 
of its kind in China. It provides a full range of medical services, teach- 
ing and research in four subspecialties: sports traumatology, sports 
rehabilitation, sports nutrition, and medical supervision. 

As the only Prevention and Treatment Center for Sports Injuries and 
Diseases designated by China General Administration of Sport 
(SGAS), our institute takes the lead in every aspect, providing medical 
coverage for 36 national and Beijing municipal sports teams, attending 
to about 15,000 athletes (over 60 Olympic champions) every year. It 
also provides top-level healthcare for the general community, with 
over 120,000 outpatient visits and 7,200 surgical operations per year. 
For more than a half century, it has made exceptional contributions to 
the success of China’s competitive sports in the world.

As the base of the Beijing Key Laboratory of Sports Injuries, the 
institute has formed a multidisciplinary research platform including 
molecular biology, cell biology, histopathology, biomechanics, molec- 
ular imaging, materials science and tissue engineering, etc. Its research 
interests mainly include: 

1. understanding the pathogenesis of common sports injuries and 
degenerative joint diseases, and identifying novel therapeutic 
approaches; 

2. tissue engineering technology for treatment of joint injuries; 

3. the effects of exercise on human physiology and biochemistry and 
the underlying mechanism; 

4. functional assessment and human performance. 


PUTH cordially welcomes job applicants and visiting scholars 
with expertise in related areas. Feel free to contact us: 

Website: http://www.puh3.net.cn 

Email: puthdyb@bjmu.edu.cn 

Tel: +86-10-82266699 

Fax: +86-10-62017700 

Address: Peking University Third Hospital, 49 North Garden Road, 
Beijing, China. 100191 
POSTDOCTORAL OPPORTUNITIES


Research Fellow Position with CDC/NIOSH 
Morgantown, WV 


The National Institute for Occupational Safety 
and Health (NIOSH) at CDC is pleased to announce 
a research fellowship opening in the Receptor Biolo- 
gy Laboratory, located in Morgantown, WV. The re- 
search aims at understanding molecular and mechanistic 
events involved in the development of occupational 
diseases, with potential emphasis on pulmonary inflam- 
mation, fibrosis, and cancer that result from exposure 
to particulates, nanomaterials, and small chemicals. 

QUALIFICATIONS include: (1) A Ph.D. or Master’s 
degree in molecular biology, biochemistry, immunol- 
ogy, toxicology, pharmacology, or cancer research; 
(2) Experience conducting in vivo and in vitro studies 
evaluating molecular and mechanistic changes as a 
result of exposure; (3) Experience in developing and 
characterizing rodent models of pathologic pheno- 
types is desirable; (4) Excellent communication skills 
and the ability to work effectively and collegially. A 
strong background in molecular and transgenic ap- 
proaches relating to disease and pathologic effects, 
and a good understanding of and experience in bio- 
chemistry, immunology, toxicology, and lung disease 
are highly desirable. 

Salary is dependent upon academic degree and expe- 
rience. NIOSH is an Affirmative Action/Equal Opportunity 
Employer. Send a letter with research experience and 
interests, Curriculum Vitae, and contacts of three ref- 
erees via email to: Dr. Qiang Ma at qam1@cdc.gov. 


VANDERBILT UNIVERSITY 
MEDICAL CENTER 


An immediate opening for a postdoctoral position 
exists at Vanderbilt University Medical Center. The 
project is to examine the role of tumor-initiating cells 
in mouse models of colonic neoplasia using unique 
reporter mice generated in the lab. Individuals with 
experience in mouse genetics and confocal imaging/ 
high resolution microscopy are encouraged to apply. 
Familiarity with modern methods for precise lineage- 
tracing, as well as examining high-dimensional data with 
multiplex immunofluorescence, smFISH, and single- 
cell RNA-seq, are desirable. Funding is secure, along 
with a highly competitive salary and fringe benefits. 
Those interested should apply online via Science Careers
and may contact Bob Coffey at e-mail: robert.coffey@vumce.org.


The National PKU Alliance (NPKUA) works to 
improve the lives of individuals with phenylketonuria
and pursue a cure. NPKUA is pleased to release its 
2019 Call for Research Proposals and Fellowships to 
continue this mission. Since 2010, NPKUA has in- 
vested $3 million in research that has led to new knowl- 
edge, acceleration of new therapies, and supported pilot 
studies that have enabled larger federal funding op- 
portunities. More information can be found on the 
NPKUA.org website under Scientific Grant Request. 
For more information email lex.cowsert@npkua.org. 


Search more jobs online 


Access hundreds of job postings 
on ScienceCareers.org. 


Expand your search today. 


STANFORD 


SCHOOL OF MEDICINE 


POSTDOCTORAL FELLOW 


The Lab of Dr. Idoyaga in the Microbiology and
Immunology Department at Stanford University
(http://idoyagalab.stanford.edu/) seeks a Postdoctoral
Fellow whose position will focus on the role of dendritic
cells and microbiota during skin cancer progression.
The successful candidate will use, among
other techniques, CyTOF and RNAseq. The applicant
must have, or be nearing completion of, a Ph.D. or
M.D./Ph.D., and be familiar with FACS, immune cell
isolation and mouse handling. Please send Curriculum
Vitae and a list of 3 references to e-mail:
jidoyaga@stanford.edu. Stanford University is an Equal
Opportunity Employer.


MEDICAL SCHOOL 


UNIVERSITY OF MICHIGAN 
POSTDOC PREVIEW 
March 7-8, 2019 
University of Michigan, Ann Arbor 


Fully funded travel for senior graduate students to 
interview with prospective mentors and tour campus. 
Applications open November 20, 2018. 

Learn more at website: http://bit.ly/UMPostdocPreview
or contact us at e-mail: postdocoffice@med.umich.edu
A unique calling: Careers in career
development for STEM doctorates 


Some people find they are more geared to careers helping others in their field than pursuing the field itself. In that vein, some 
Science, Technology, Engineering, and Mathematics (STEM) Ph.D.s are inspired to pursue careers in counseling STEM postdocs 
about their careers—sometimes by STEM and other Ph.D.s who are career counselors themselves. By Alaina G. Levine 


Caleb C. McKinney, assistant dean of graduate and postdoctoral
training and development at Georgetown University Medical Center
in Washington, D.C., laughs when he thinks about how he maneuvered
his Ph.D. in virology into a career in career development. As a
postdoc at the U.S. National Institutes of Health (NIH), he was
training students in his group on the practicalities of conducting
research and found the experience to be personally transformative.

“I was writing a letter of recommendation for a student
I had helped when I realized that I wanted to have these
‘realized’ moments on a bigger scale,” McKinney says.

He approached the Office of Training and Diversity at 
the National Institute of Allergy and Infectious Diseases 
(NIAID) about assisting them with their efforts, and 
subsequently volunteer-coordinated activities that 
fostered the professional development of NIAID fellows. 
His career, and his bliss, were on their way. 

Some scientists and engineers who have navigated 
the knotty question of “what should I do with my life?”
choose to remain on “Rue de Research” and pursue 
traditional academic professions. Then there are a few, 
like McKinney, who respond by deciding they want to help 
other Ph.D.s find impactful careers. And that’s why we are 
seeing the emergence of the still fledgling field of career 
development for doctorates in STEM now buoyed by 
STEM doctorates themselves. 




One career, many paths 

Along these lines, there is now 
a growth in formal institutional 
administrative divisions such as 
postdoc affairs offices, which 
help STEM grad students and 
postdocs think about their career 
opportunities. There are also 
organizations with missions to 
advance the careers of those 
in the career development 
profession, such as the Graduate 
Career Consortium (GCC) and the National Postdoctoral 
Association (NPA). Additionally, “train-the-trainers” 
programs, such as those organized by NIH that support 
the knowledge growth of investigators and other mentors 
who are working with protégés, are making an impact in 
expanding this profession. 

“It's growing by leaps and bounds,” says Patrick Brandt, 
director of career development and training in the Office of 
Graduate Education at the University of North Carolina at 
Chapel Hill. “10 years ago, there weren't many institutions 
hiring Ph.D.-trained professionals in this area.” 

Natalie Lundsteen, assistant dean for career and profes- 
sional development at the University of Texas Southwestern 
Medical Center in Dallas (UT Southwestern), says her career 
was launched when she noticed “a need for someone with career
skills and a Ph.D. to work with grad students,
which aligned with a big explosion in the world
of career development in the late 2000s.” Her 
dissertation research followed students pur- 
suing internships at London banks, through 
which she discovered that the skills students 
need to succeed in a workplace are not neces- 
sarily linked to skills gained in academia. This 
finding inspired her to take on a career de- 
velopment role at the Massachusetts Institute 
of Technology, and she was subsequently re- 
cruited to UT Southwestern to build its career 
development division from scratch. Today, she 
assists biosciences Ph.D. students with their 
career strategies and is actively involved with 
GCC. Many of her advisees have gone on to 
careers in career development themselves. 

To paraphrase an old Paul Simon tune, there 
are “50,000 ways” to leave your research and
arrive at career development as a career. Im- 
munologist Lia Paola Zambetti used commu- 
nications to do so. She was a research fellow 
at the Singapore Immunology Network (SIgN) 
of the Agency for Science, Technology and 
Research (A*STAR) in Singapore, but wanted 
“to get out of the academic grind and find a 
job that was not related to the bench.” She 
had already been engaged in science com- 
munications and had been freelancing as a 
popular science writer for several years. Through networking, 
she found a position in a new communications office at A*STAR.
After three years there, she secured her current position at the 
University of Sydney, where she manages a fellowship program 
and organizes trainings in soft skills, such as public speaking, 
networking, and leadership for early-career researchers. 

Tracy Costello's path was crystalized as a postdoc in genet- 
ics and biostatistics at the University of Texas MD Anderson 
Cancer Center in Houston. While still a fellow, she volunteered 
with and later served on the board of directors of NPA. “It solidified
for me that I had the ability to impact people beyond
my particular time, in that some of the things I was working on
might not benefit me or my peers, but will benefit postdocs in
the future,” she says. “It was very freeing when I realized this
was the direction I wanted to go.”

After finishing her appointment, Costello did a brief foray 
in industry, but was quickly recruited back to MD Anderson to 
shape and launch its postdoc affairs office. An invitation to do 
the same at the Moffitt Cancer Center in Tampa, Florida, came 
four years later. Today, in addition to her job, she is serving as 
the chair of NPA. 


What it takes 

Each career development job is slightly different from 
the next, depending on many diverse factors including the 
institution’s culture, funding stream, and history of career 
advising. Some STEM career development specialists 
report to the dean of the graduate school or vice president 
for student affairs and serve on a team of career advisors. 
Alternatively, they could be organized under the vice provost
for research or graduate education and enjoy the title of dean or
assistant dean themselves. Other positions are more solitary, such as
in postdoc affairs offices, which could consist of a single person,
the director, who serves as both program manager and support staff.

“It solidified for me that I had the ability to impact people beyond
my particular circle, in that some of the things I was working on
might not benefit me or my peers, but will benefit postdocs in the
future.”
- Tracy Costello

“You will wear many hats,” explains 
Lundsteen, “including event planner, 
conflict resolver, manager, public relations 
manager, public speaker, salesperson, 
counselor, resource gatherer, and writer.” 
Other responsibilities could include 
strategic planning, stakeholder relations, 
communications, project and program 
management, logistics, negotiation, policy, and 
fundraising. Luckily, many of the skills required 
to be successful in the career development 
arena mirror those that Ph.D. scientists and 
engineers have acquired through their research 
training, such as proposal development and 
data collection. 

Being comfortable with event planning is 
generally a must, as the career development 
professional will be organizing all kinds of 
trainings, workshops, networking mixers, and 
speaker series for their constituents—often sev- 
eral events at once. Brandt adds that having a 
“service mindset” will help with event-planning. 
“You have to be OK with doing mundane tasks, such as reserv- 
ing rooms and caterers and inviting keynote speakers and ex- 
plaining to them what you want them to speak about,” he says. 

Naturally, strong communication skills are critical. “It’s all
about communicating backwards and forwards with the students,
employers, alumni, and entities in the community,” says
Lundsteen, who regularly stays apprised of industry trends for
her charges by interfacing with the regional chamber of commerce
and reading the Dallas Business Journal. “I am a clearinghouse
for opportunities and information, and my job is to
be objective and to present alternatives to students.”

Costello adds that the most important skills for success 
in this sector are being able to listen, to ask insightful ques- 
tions, and to provide clients “a safe space to explore what 
they want to do.” 


Making the transition 

For many career development professionals, the seeds for 
their career advancement were planted in the institutions in 
which they conducted their Ph.D. or postdoc research. They 
sought out opportunities to volunteer, assisted in career coun- 
seling efforts, did informational interviews, and demonstrated 
to the community that this was their passion and aspiration. 

“The number one way to position yourself for a role like this 
is to get involved, whether it is at your local university or at the 
national level,” urges Costello. “It’s critical that people have 
a microexperience of a career, whether it is an internship or 
volunteer experience. It's a huge plus, because when you're 
actually applying for the job, you can say you are already engaged
in it.”
Faculty Cluster Hire in Artificial Intelligence

The University of Texas at San Antonio (UTSA) is seeking candidates to fill eight (8) faculty positions
to foster collaborative research, education and outreach and to create interdisciplinary areas of
knowledge that will advance the field of Artificial Intelligence (AI). All positions are at the
Tenure-Track Assistant, Associate or Full Professor level.


College of Architecture, Construction & Planning
Construction Science (1) - https://jobs.utsa.edu/postings/10151
Specializing in decision-making, decision support, unsupervised deep learning, current and future
data for Internet of Things (IoT) and smart development of urban environments.


College of Business
Management Science & Statistics (1) - https://jobs.utsa.edu/postings/10148
Specializing in applied statistics and experience with AI, machine learning, and operational research
within a multidisciplinary environment.

Information Systems and Cyber Security (2) - https://jobs.utsa.edu/postings/10100
Specializing in conducting research and developing tangible solutions to security challenges,
particularly with gathering intelligence from unstructured data sets in real-time.


College of Engineering
Electrical and Computer Engineering (1) - https://jobs.utsa.edu/postings/10159
Specializing in AI, as it relates to smart health systems, electronic health records, digital health
science, cloud computing, public health, or diagnostic radiology/computational imaging/image
processing.

Psychology (1) - https://jobs.utsa.edu/postings/10099
Specializing in learning in complex data environments, resources-constrained AI processing,
generalizable and predictable AI, deep learning, natural language processing, machine intelligence,
super-intelligence, logics for intelligent interaction, logic for multi-agent systems in AI human factors,
cyber psychology, privacy issues and healthcare applications.


College of Public Policy
Demography (1) - https://jobs.utsa.edu/postings/10101
Specializing in predictive modeling and data visualization for healthcare demand/burden, urban
environments and planning, food environments and systems applications.


College of Sciences
Computer Science (1) - https://jobs.utsa.edu/postings/10158
Specializing in cyber adversarial learning, resource constrained AI, or AI as it relates to cloud
computing, bioinformatics and other health-related applications.


Details/To apply: http://research.utsa.edu/ai 


As an Equal Employment Opportunity and Affirmative Action employer, it is the policy of The University of 
Texas at San Antonio to promote and ensure equal employment opportunity for all individuals without regard to 
race, color, religion, sex, national origin, age, sexual orientation, gender identity, disability, or veteran status. The 
University is committed to the Affirmative Action Program in compliance with all government requirements to 
ensure nondiscrimination. The UTSA campus is accessible to persons with disabilities. 
Georgetown University Biomedical Graduate Education, Postdoctoral
Development
biomedicalprograms.georgetown.edu/postdoc

University of Sydney
sydney.edu.au

UT Southwestern Medical Center, Graduate Career Development Office
www.utsouthwestern.edu/education/graduate-school/about-us/career-services

Moffitt Cancer Center
www.moffitt.org

Peters Lab, Maastricht University
www.maastrichtuniversity.nl/peter.peters

UNC School of Medicine, Office of Graduate Education, Science
Training and Diversity (STaD)
www.med.unc.edu/oge/stad/about-us


Additional resources

Graduate Career Consortium
gradcareerconsortium.org

National Postdoctoral Association
www.nationalpostdoc.org


Seek out these opportunities early, says Lundsteen, by
getting involved in your university's grad student or postdoc
association. It’s even better to join these organizations’
career committees so you get a greater “understanding
the mechanics of the job,” she advises. And take heed: if
your institution doesn’t have one of these organizations or
committees, why not be an innovator and start one yourself?

One important aspect of making the transition is
to ensure proper communications with your mentor.
Regarding principal investigators (PIs) who have already
demonstrated that they are open to you pursuing nonbench
careers, it would be prudent to start a discussion with them
early on to safeguard that smooth transition. With mentors
on the other end of the spectrum, who may be less than
enthused if you suggest you want to do anything outside
the ivory tower, you should be careful about when and how
you broach the subject, and try to do so in a safe manner
that doesn’t damage your relationship or endanger your
employment arrangement.

McKinney took extra care to involve his PI in his plans early
on. “I started developing project management platforms
and working with my PI every week to make sure I was on
target with my experimental deliverables, so I could get that
extra time in the volunteer experience,” he says. “I kept her
informed as a key stakeholder.”

But while you are engaging your PI and looking to do a
side gig to gain experience, it’s important not to sacrifice
your research. “Be good in the lab because you want to
have high credibility. You have to be taken seriously,” warns
Peter J. Peters, university professor and Limburg Chair at
Maastricht University in The Netherlands.

While serving as the dean of postdoctoral affairs at the
Netherlands Cancer Institute, Peters built the Postdoc Career
Development Initiative (PCDI) to mentor and inspire young
researchers at early stages of their scientific careers; it was
later formally funded by the Dutch Ministry of Economic
Affairs and became an independent organization. “People
need to recognize you as someone who is good at science
and a good citizen in the institution. Then, the director will
give you money for your ideas,” he says. “If you are sloppy in
your work as a postdoc, you won't get momentum for your
work at the institution.”

[Photo: Peter J. Peters]


Network to work

Some career development jobs for STEM postdocs are
advertised, but Costello notes that relatively few open up
each year, and, as is the case for most of the sources
interviewed here, they usually get their positions because
their reputation is known among decision-makers and
institutions. “It's gratifying to see the number of positions
increase and the growth in this area, although it is still
difficult to get into it,” says Brandt. “You have to be willing to
move where the jobs are.”

You also have to network. “It doesn’t matter if you are shy 
or introverted, nobody will do the work for you if you don’t 
take ownership of your path,” says Zambetti. “The only way 
out is networking—it’s painful and tough and excruciating at 
the start, but it does get easier with practice.” 

Fortunately, most counselors are happy to help others 
who want to explore this profession. “Anyone in this 
profession would be willing to have someone shadow 
them or sit in on appointments,” says Lundsteen. “We 
are the best people to ask for an informational interview 
because what we do for a living is tell people to do 
informational interviews.” 


The payoff for this path 

One of the features of this career path that is especially 
gratifying for those with STEM degrees is that they get to 
remain a part of the scientific enterprise while they influence 
the next generation of scientists. Brandt loves the fact that 
he is still an active participant in higher education. He still 
publishes, although now it is in education research. 

“I like the flow of the academic year, being on campus,
and hearing the bell tower chime; I feel like I’m an
academician,” he says. “I also love the science and still feel
like I get to vicariously enjoy it.”

And then there is the definitive payoff. “I love my job
every day because I get to help people figure things out.
Hopefully they don't feel the pain and struggle I felt at not
knowing where to go if it’s not going to be faculty,” says
Costello. “Most mentors don’t know how to guide you in any
of the other career opportunities.”

Lundsteen agrees: “My life’s mission is showing people
their capabilities and possibilities, and that brings me the
greatest joy. I see I'm making a difference. I have the luxury
of helping people and having them write me and say, ‘You
played a part.’”


Alaina G. Levine is a science writer, science careers consultant, professional 
speaker, and author of Networking for Nerds (Wiley, 2015). 
WARREN ALPERT FOUNDATION


NEW POSTDOCTORAL FELLOWSHIP 


The Warren Alpert Foundation announces the creation of the Warren Alpert Distinguished
Scholars program that supports individual scientists of exceptional creativity who have
an MD or PhD degree and who are post-doctoral fellows in the neurosciences in a medical
school.

These awards are given as transitional awards before recipients become a member of a
faculty at the Assistant Professor level or higher. Deans of medical schools are invited to
submit one nomination.


Please see web page www.warrenalpertfoundation.org for details. 


Applications are due on January 15, 2019. 


Stony Brook University

Multiple Postdoctoral Positions
Fall/Winter 2018

Stony Brook University is recruiting
for multiple postdoctoral positions
in various sub-specialties for the
upcoming fall and winter months.

Stony Brook has been characterized
by innovation, energy and progress,
and making ground-breaking
discoveries since its beginning half
a century ago.

Any interested candidates are
invited to visit our JOBS page.

www.stonybrook.edu/postdocjobs

Stony Brook University/SUNY is
an equal opportunity, affirmative
action employer.


Pfizer Worldwide Research and Development Postdoctoral Program 


At Pfizer, postdocs are trained to become successful, independent investigators, 
capable of formulating and addressing important scientific hypotheses. In addition, 
trainees receive broad exposure to the process of drug discovery, from idea to 
clinical trials. Areas of scientific focus include cardiovascular and metabolic 
diseases, comparative medicine, drug safety, biotherapeutics/protein engineering, 
inflammation and immunology, medicinal chemistry, oncology, pharmacology, 
vaccines, and clinical, computational, and genomic sciences. 


We recruit highly motivated Ph.D. recipients with an outstanding track record of 
scientific productivity and a passion for ground-breaking, fast-paced research that 
facilitates the development of innovative therapies for human diseases. Our program 
promotes dissemination of research results through publications and participation in 
scientific meetings, provides opportunities for collaboration with leading academic 
labs and industry consortia, and offers exceptional professional development 
training and networking opportunities. 


To explore our program and research, visit us online at: 
www.pfizer.com/careers/en/postdoctoral-program 


pfizercareers.com 


online @sciencecareers.org 


ScienceCareers 


POSTDOCTORAL OPPORTUNITIES 
Lawrence Livermore 
National Laboratory 


Our Next Breakthrough IS YOU 


Lawrence Postdoctoral Fellowship 
The Opportunity to Bring your Brightest Ideas to Life 


We know that you are already working hard to solve important research questions. But do 
you want to take your skills to the next level and apply them to solving the nation’s most 
pressing problems in national security? The Lawrence Livermore National Laboratory (LLNL) 
has openings available in the Lawrence Fellowship Program that will allow you to do just 
that. We want you to apply for this prestigious fellowship, which offers you the freedom to 
conduct the independent, self-directed, cutting-edge research that you have always dreamed 
about. Fellowships are awarded to applicants with extraordinary talent, credentials, leadership 
potential and a track record of research accomplishments. Is that you? 


Successful Fellows will propose and subsequently perform creative research in an area that is relevant 
to the mission and goals of LLNL. Broad topic areas include: Physics, Applied Mathematics, Computer 
Science, Chemistry, Material Science, Engineering, Environmental Science, Atmospheric Science, 
Geology, Energy, Lasers and Biology. You will be able to participate in experimental or theoretical 
work at LLNL and will have access to LLNL’s extensive computing facilities and specialized laboratory 
facilities. The duration of the Fellowship is up to three years. The salary is $9,476/mo. 


Please refer to the following web page http://apptrkr.com/1255150 for eligibility requirements and 
instructions on how to apply. When applying and prompted, please mention where you saw this ad. 
The deadline for applications is October 1, 2018. LLNL is operated by the Lawrence Livermore National 
Security, LLC for the U.S. Department of Energy, National Nuclear Security Administration. We are an 
equal opportunity employer with a commitment to workforce diversity. 


LLNL is an affirmative action / equal opportunity employer. 


MUSC Hollings Cancer Center 


The Hollings Cancer Center seeks applications for the T32 
Integrative Training in Oncogenic Signaling (ITOS) Postdoctoral 
Fellow Program. The goal of the Program is to train competitive 
postdoctoral trainees that will represent the next generation 
of cancer researchers. Selected fellows will be 
provided with an outstanding research and academic 
environment and professional opportunities 
including exposure to a wide variety of biological 
systems, approaches, and technologies in the study 
and translation of basic cellular processes involved 
in the development of cancer. ITOS Fellows will 
also have access to the most modern types of 
high-resolution imaging, advanced microscopy, 
genome-level profiling, proteomics, gene manipulation and 
cell tracking, as well as exposure to systems biology 
and bioinformatics. Fellows can be supported in the 
Program for two years. 


Eligible candidates for support as Hollings Cancer 
Center T32 ITOS Program Fellows must meet the 
following requirements: 

* Have a doctoral degree in a relevant discipline 
from an accredited domestic or foreign 
educational institution. 

* Be a US citizen or have verifiable status as a 
permanent resident. 


The ITOS program will prioritize candidates 
seeking their first postdoctoral experience, although 
a second postdoctoral training experience may be 
considered if there is an outstanding candidate 
changing his or her focus to oncogenic signaling. 


To apply, please visit this website: 
http://www.hollingscancercenter.org/research/membership-opportunities/t32/eligibility-app.html. 
For inquiries, please contact: Jill Ussery at 
usseryj@musc.edu or (843) 792-4203 




REGENERATION NEXT POSTDOCTORAL FELLOWSHIP 
PROGRAM, DUKE UNIVERSITY 


Regeneration Next is a campus-wide initiative to stimulate high impact research 
that crosses disciplinary boundaries in regenerative biology and medicine. We 
announce a three-year postdoctoral fellowship program for research at Duke 
University, awarding a stipend averaging $55K/year, health insurance benefits, 
and $10K/year for research and travel expenses. Candidates who have recently 
completed or soon expect to complete their PhD or MD/PhD degrees at US or 
international institutions are encouraged to apply. 


Accomplished candidates should identify a Regeneration Next-affiliated lab(s) 
of interest and contact the principal investigator to apply for a postdoc position. 
Faculty lead cutting edge programs in developmental and regenerative biology, 
stem cells, imaging, mechanobiology, gene editing, tissue engineering, and 
related areas. A full list is available at regenerationnext.duke.edu under 
“Faculty”. 


Candidates will need to apply directly to the Faculty lab to obtain an interview. 
Applicant and Faculty sponsor will submit a brief application for a Regeneration 
Next fellowship. Details and application instructions can be found at: 
regenerationnext.duke.edu under “Postdoc.” Awards are competitive and will 
be judged and awarded on a rolling basis until slots are filled. Applications will 
be accepted beginning October 1, 2018. 


Questions may be directed to: Ken Poss, Director (regeneration@duke.edu). 
Duke University is an Affirmative Action/Equal Opportunity Employer 
committed to providing employment opportunity without regard to an 
individual's age, color, disability, genetic information, gender, gender 
identity, national origin, race, religion, sexual orientation, or veteran status. 


MICHIGAN STATE 
UNIVERSITY 


Postdoctoral Research Associate Posting 
Position Summary: 


The Institute for Quantitative Health Science and Engineering at 
Michigan State University is recruiting motivated postdoctoral 
fellows to advance these fields and invites applications from 
outstanding candidates for up to 10 postdoctoral positions in 
areas of biomedical research including biomedical devices, 
biomedical imaging, chemical biology, developmental and stem 
cell biology, neuroengineering, structural biology, synthetic biology, 
systems biology, motion analytics, and precision health. 


The Institute for Quantitative Health Science and Engineering 
is called IQ for its dedication to the development of intelligent 
solutions to the most-pressing biomedical quandaries facing 
scientists and clinicians. To address these important problems, we 
have gathered some of the most creative minds from around the 
country to build integrative programs that bridge disciplines and 
integrate strategies for convergent tactical solutions. 


For more information and to view faculty profiles, please visit our 
website at https://iq.msu.edu/ 


Required Degree(s): 
PhD in engineering, biology and/or related field(s), MD/PhD, or 
equivalent degree(s) 


Please visit http://careers.msu.edu to apply. (Reference posting 
#524444) 


IEEE 


IEEE Engineering in Medicine 
and Biology Society 


Advancing Technology for Humanity 


Be a part of the growing world 
of Engineers 
working in Medicine and Biology... 


www.embs.org 


Why IEEE EMBS? 

We are the world’s largest international society of engineers that work 
in the biomedical community. The organization's 10,000 members 
reside in some 97 countries around the world. Whether you are an EE 
in Bio, a biomedical, mechanical or chemical engineer or a clinician 
interested in the latest technology ... there is a place for you in our 
society. 


EMBS provides its members with access to the people, practices, 
information, ideas and opinions that are shaping one of the fastest 
growing fields in science to make an impact in advancing technology 
for humanity. 


Build a Network 

IEEE EMB houses an unrivaled network of professionals, experts, and 
advisors that can help shape your career, offer resources to acquire 
new skills, advance your professional development, and provide 
numerous opportunities for involvement, recognition, and reward. 


Collaborate & Innovate 
Across the globe with IEEE EMB colleagues, either online or in person, 
to build a support group for your profession, industry, or project 


Get Ahead 
Be “in the know” with the latest research, technology trends, industry 
news, and local events from IEEE EMB Journals and Conference 
Proceedings 


Be a Member 

Make a global impact .... Membership gives you access to the people, 
practices, information, ideas and opinions that are shaping one of the 
fastest growing fields in science! 


Join us in advancing technology for humanity 


Career Feature: 
Artificial Intelligence 


Issue date: November 30 
Book ad by November 15 
Ads accepted until Nov 21 if 
space allows 


To book your ad: 
advertise@sciencecareers.org 


The Americas 
+ 202 326 6577 

Europe 
+44 (0) 1223 326527 

Japan 
+81 3 6459 4174 

China/Korea/Singapore/Taiwan 
+86 131 4114 0012 


Produced by the Science/AAAS 
Custom Publishing Office. 


ScienceCareers 


BY AAAS 


SCIENCECAREERS.ORG 
THERE’S ONLY ONE SCIENCE. 


Why choose this AI Feature 
for your advertisement? 


* Relevant ads lead off the 
career section with a special 
“AI” banner 


* Link on the job board 
homepage directly to AI jobs. 


Science 


Postdoctoral Fellow 
RUTGERS 


Genomics Sequencing 
Core Facility 
Waksman Institute of 
Microbiology 


The Waksman Institute at Rutgers University invites applications for a 
Postdoctoral Fellow for Genomics Sequencing, with a tentative starting 
date of October 1, 2018. 

We are seeking individuals experienced in analyzing high-throughput 
sequencing data from next-generation genome sequencers. The analysis entails 
assessing the quality of the data, processing the data through appropriate 
analysis pipelines, determining the quality of the analysis and whether further 
analysis should be done, and assembling results for researchers. 


The Waksman Institute is home to over 15 faculty members who use a 
broad range of approaches and experimental systems in numerous well- 
funded research programs. The Institute is part of a vibrant and interactive 
life sciences community that includes the School of Environmental and 
Biological Sciences, School of Arts and Sciences Division of Life Sciences, 
the Center for Advanced Biotechnology and Medicine, the Cancer Institute 
of New Jersey, the Human Genetics Institute of New Jersey, and the Robert 
Wood Johnson Medical School. A leading research university, Rutgers is a 
member of the AAU and CIC. For more information, please visit our website: 
https://waksman.rutgers.edu. 

Applicants must have a Ph.D. in bioinformatics, statistical genetics, computer 
science, and/or population genetics, with experience in genomics analysis, 
next-generation sequencing, and pipeline/database development. Proficiency 
in a Unix/Linux environment and with at least one programming language 
(Python, R, Perl, Java, C/C++, etc.) is required. The candidate must have excellent 
knowledge of and experience with large-scale biological data analyses, 
especially high-throughput sequencing data. Candidates should submit a 
CV, cover letter, transcript, and letters of reference to: 
https://jobs.rutgers.edu/postings/73390. For consideration, applications must be submitted 
electronically. 


Rutgers is an Equal Opportunity/Affirmative Action Employer. For 
additional information please see the Non-Discrimination Statement 
at: http://uhr.rutgers.edu/non-discrimination-statement. 




Jefferson Science Fellowship 


FROM THE AMERICAN PEOPLE 


The National Academies of Sciences, Engineering, and Medicine is pleased to 
announce a call for applications for the 2019 Jefferson Science Fellows (JSF) 
program. Initiated by the Secretary of State in 2003, this fellowship program 
engages the American academic science, technology, engineering and medical 
communities in the design and implementation of U.S. foreign policy and 
international development. 


Jefferson Science Fellows spend one year on assignment at the U.S. Department 
of State or the U.S. Agency for International Development (USAID) as science 
advisors on foreign policy/international development issues. Assignments are 
tailored to the needs of the hosting office, while taking into account the Fellows’ 
interests and areas of expertise. 


The fellowship is open to tenured, or similarly ranked, academic scientists, 
engineers, and physicians from U.S. institutions of higher learning. Applicants 
must hold U.S. citizenship and will be required to obtain a security clearance 
prior to beginning the fellowship. 


The deadline for applications for the 2019-2020 program year is October 31, 
2018. To learn more about the Jefferson Science Fellows program and to apply, 
visit www.nas.edu/jsf 


NRC Research Associateship Programs 

The National Academies of Sciences, Engineering, and Medicine offers 
postdoctoral and senior research awards on behalf of more than 20 U.S. federal 
research agencies and affiliated institutions with facilities at over 100 locations 
throughout the U.S. and abroad. 

We are actively seeking highly qualified candidates including recent doctoral 
recipients and senior researchers. Applications are accepted during four annual 
review cycles (with deadlines of November 1, February 1, May 1, and August 1). 

Awardees have the opportunity to: 
* conduct independent research in an area compatible with the interests of 
the sponsoring laboratory 
* devote full-time effort to research and publication 
* access the excellent and often unique facilities of the federal research 
enterprise 
* collaborate with leading scientists and engineers at the sponsoring 
laboratories 

Benefits of an NRC Research Associateship award include: 
* 1-year award, renewable for up to 3 years 
* Stipend ranging from $45,000 to $80,000, higher for senior researchers 
* Health insurance, relocation benefits, and professional travel allowance 

Applicants should hold, or anticipate receiving, an earned doctorate in science or 
engineering. Degrees from universities abroad should be equivalent in training 
and research experience to a degree from a U.S. institution. Some awards are 
open to foreign nationals as well as to U.S. citizens and permanent residents. 

The National Academies of Sciences, Engineering, and Medicine’s 
Fellowships Office has conducted the NRC Research Associateship 
Programs in cooperation with sponsoring federal laboratories and other 
research organizations approved for participation since 1954. Through 
national competitions, the Fellowships Office recommends and makes NRC 
Research Associateship awards to outstanding postdoctoral and senior 
scientists and engineers for tenure as guest researchers at participating 
laboratories. A limited number of opportunities are available for support 
of graduate students in select fields. 


The National Academies of 
SCIENCES · ENGINEERING · MEDICINE 


The Jefferson Science Fellows program is administered by the National 
Academies of Sciences, Engineering, and Medicine and supported by the 
U.S. Department of State and the United States Agency for 
International Development. 




POSTDOC OPPORTUNITIES 


PHYSICAL CHEMISTRY 
FACULTY POSITIONS 
Boston College 
Chemistry Department 

The Chemistry Department of 
Boston College invites applications for two 
tenure-track positions to be effective in the fall of 
2019. Applicants will be evaluated based on their 
potential to establish a prominent and well-funded 
research program and to excel in teaching at the 
graduate and undergraduate levels. Successful 
applicants will join a department of approximately 
120 doctoral students, 30 postdoctoral fellows, 
200 undergraduate majors, and an internationally 
recognized faculty. 

Assistant Professor in the area of Physical 
Chemistry with a focus on computational/ 
theoretical chemistry requires a Ph.D. in Chemistry 
or related areas; postdoctoral experience is 
desirable but not required. Exceptional candidates 
with expertise in other areas are encouraged 
to apply, as well. The candidates are expected 
to have published in top-refereed journals and 
demonstrated the ability to perform outstanding 
independent research. 

Interested applicants must submit a cover letter 
(which includes the names of three references), 
a graphical executive summary of research plans 
(one page), curriculum vitae, a summary of 
research plans (eight pages maximum), a statement 
of teaching philosophy and arrange to have three 
letters of reference submitted via the online faculty 
application at: apply.interfolio.com/53188. 
All application materials must be submitted 
electronically on or prior to October 15, 2018. 

Boston College, a university of eight schools and 
colleges, is an Equal Opportunity Employer and 
supports Affirmative Action. 


Science Careers 
FROM THE JOURNAL SCIENCE 
AAAS 


The University of Georgia 

Multiple Postdoctoral positions available in 
Genetics at the University of Georgia in diverse 
areas of molecular and population genetics and 
genomics. http://www.genetics.uga.edu/; 
https://postdocs.uga.edu/. 

The Ye lab studies genetic adaptation to diet 
during human evolution and the genetic basis 
of complex metabolic diseases. Experience in 
population genomics or bioinformatics preferred. 
Contact: Kaixiong (Calvin) Ye, 
Kaixiong.Ye@uga.edu 

The White lab studies mechanisms underlying 
the evolution of young sex chromosomes in the 
threespine stickleback fish. Contact: Dr. Mike 
White, whitem@uga.edu 

The Goll lab studies the molecular bases of 
heterochromatin establishment during embryonic 
development. Experience with RNA-seq 
and ChIP preferred. Contact: Dr. Mary Goll, 
Mary.Goll@uga.edu 

The Sweigart and Parrott labs are developing 
new transgenic approaches in the wildflower 
genus Mimulus (monkeyflower). Contact: Dr. 
Andrea Sweigart, sweigart@uga.edu 

The Terns lab studies the basic biology and 
applications of CRISPR-based prokaryotic 
anti-viral immune systems. Contact: Dr. Mike 
Terns, mterns@uga.edu 


Follow us for jobs, 
career advice 
and more! 


@ScienceCareers 
/ScienceCareers 
TENURE-TRACK ASSISTANT PROFESSOR 
PHYSICAL CHEMISTRY 


Harvard University Faculty of Arts and Sciences 
Department of Chemistry and Chemical Biology 


Position Description: Candidates are invited to apply for a tenure-track 
assistant professorship in physical chemistry, broadly defined, including 
experimental and theoretical research in areas such as but not limited to atomic 
and molecular physics, biophysical chemistry, condensed matter, quantum 
science and ultrafast spectroscopy. The appointment is expected to begin on 
July 1, 2019. The tenure-track professor will be responsible for teaching at 
the undergraduate and graduate levels. We are seeking candidates who have 
an outstanding research record and a strong commitment to undergraduate 
and graduate teaching. 


Basic Qualifications: Doctorate or terminal degree in chemistry or related 
discipline required by the time the appointment begins. 


Additional Qualifications: Demonstrated experience in teaching is desired. 


Special Instructions: Please submit the following materials through the ARIeS 
portal (http://academicpositions.harvard.edu/postings/8371). Applications must 
be submitted no later than October 15, 2018. 
1. Cover letter 
2. Curriculum Vitae with publications list 
3. Teaching statement (describing teaching approach and philosophy) 
4. Outline of future research plans 
5. Names and contact information of 3-5 references. Three letters of 
recommendation are required, and the application is complete only when 
all three letters have been received. 
6. Selected publications 


Contact Information: Susan M. Kinsella, Search Administrator, Department 
of Chemistry and Chemical Biology, Faculty of Arts and Sciences, Harvard 
University, 12 Oxford St., Cambridge, MA 02138. Phone: 617-496-4088. 
kinsella@chemistry.harvard.edu 


Harvard is an Equal Opportunity Employer and all qualified applicants will 
receive consideration for employment without regard to race, color, religion, 
sex, national origin, disability status, protected veteran status, gender 
identity, sexual orientation, pregnancy and pregnancy-related conditions, or 
any other characteristic protected by law. 


SCHOOL OF MEDICINE & DENTISTRY 
UNIVERSITY OF ROCHESTER 
MEDICAL CENTER 


Tenure-Track/Tenured Faculty Position 


The Center for Oral Biology in the Eastman Institute for Oral Health 
invites applications for a faculty position at the early or mid-career level. 
Successful applicants should have a PhD, MD, DDS, or combined degrees, 
and demonstrated ability to conduct an innovative research program to 
investigate an area of science relevant to human disease/oral biology, 
including tooth and craniofacial development, salivary gland biology, 
orofacial pain, or oral bacteriology/immunology. Preference will be 
given to applications that complement ongoing programs or bring novel 
expertise and research perspectives. Individuals seeking an appointment 
must have demonstrated the ability to conduct independent research. The 
Center for Oral Biology is located in the state-of-the-art Arthur Kornberg 
Medical Research Building at the University of Rochester School of 
Medicine and Dentistry. Faculty members in the Center carry joint 
appointments in appropriate academic departments and participate in 
graduate student training in several graduate programs at the University 
of Rochester. More information about the Center and available positions 
can be found on the internet 
(http://www.urmc.rochester.edu/center-oral-biology/). 


For further details and to apply online, please go to: 
http://www.rochester.edu/working/hr/jobs/ (Job ID #203687). Please provide your 
curriculum vitae, statement of current and future research interests, and 
names and addresses of at least three references. 


The University of Rochester is an Equal Opportunity Employer. 
Women and minorities are encouraged to apply. 


Step up your job search 
with Science Careers 


Search ScienceCareers.org today 
Yale University 
School of Medicine 


FACULTY POSITION AT THE ASSISTANT 
PROFESSOR LEVEL 


DEPARTMENT OF CELLULAR AND 
MOLECULAR PHYSIOLOGY 


The Department of Cellular and Molecular Physiology is 
conducting a search for new faculty members at the assistant 
professor level. 


The search seeks candidates whose research connects the 
properties of molecules to the properties of physiological 
systems. 


Excellent opportunities are available for collaborative research, 
as well as for graduate and medical student teaching. Candidates 
must hold a Ph.D., M.D., or equivalent degree. Applicants should 
include a curriculum vitae, a statement of research interests and 
goals, and should arrange to have three letters of reference sent. 
Applicants should apply at the following website: 
http://apply.interfolio.com/53471 


Application Deadline: October 19, 2018 


Yale University is an Affirmative Action/Equal Opportunity 
Employer and welcomes applications from women, persons 
with disabilities, covered veterans, and members of 
minority groups. 


The detour that became a shortcut 


Like many science students, I had always envisioned a pretty straightforward career path: 
a graduate degree, postdoctoral research, and, if all went well, a faculty position. But I was 
thrown off this track before I even completed my bachelor’s degree in biology. A university strike 
delayed my graduation, and as a result I missed the graduate school application deadline. 
Suddenly I had no idea what my future might hold, and I needed to make a living. I was relieved to 
be offered a job managing a newly established conservation area in my home state of Sergipe in 
Brazil, and I was excited about working to support biodiversity. But in the back of my mind, I worried 
that the job would take me in the wrong direction, away from the academic career I still desired. 


The idea of managing a protected area was appealing, but my everyday 
workload was far from inspiring. I handled some interesting 
challenges, such as reaching a compromise with the ranchers whose 
cattle needed to cross the reserve for water. But I spent more time on 
paperwork and meetings than on ecosystems and biodiversity. And 
the only opportunities for career advancement were administrative 
positions, one step away from becoming a career bureaucrat. That 
was not how I wanted to spend my life. 

So, 3 years in, I decided that I needed to make a change. I had 
managed to complete a master’s degree in ecology and conservation 
on the side while working at my day job, and in my spare time, I studied 
the reserve’s frogs. But it was time to get back on the academic 
ladder full time. Applying to Ph.D. programs was the 
obvious next step. 

When I was accepted into a program in ecology and evolution, 
I couldn’t wait to trade government paperwork for 
the intellectual stimulation of being fully immersed in research. 
Yet I was a bit unsure how well my transition back 
to academia would go. Would the skills I developed during 
my years at the reserve be of any use in my new endeavor, 
or would I be hopelessly rusty and lost? 

At first, as I had feared, I felt a little behind my fellow 
students. Despite the supportive environment, I couldn’t 
escape the fact that I lacked skills vital to my new research 
field, such as programming and advanced statistics. 
I doubted that I would ever make any progress in my research 
or produce a decent thesis. 

But I soon realized that, during my time at the reserve, 
I had developed my own valuable skills. Managing the 
conservation area, which relied on community participation and 
compromise, had taught me to work collaboratively. Through juggling 
reserve management, community meetings, and endless paperwork, 
I had learned to work creatively and, above all, to get things done. 
I soon realized that doing multivariate analyses was no harder than 
dealing with the multidimensional problems of reserve management, 
and that writing scientific papers was no more demanding than 
compiling environmental policy reports. And my collaborative 
approach served me well as I worked closely with my new peers. In time, 
I gained the confidence I needed to succeed. 


“I worried that the job would take me in the wrong direction.” 

Three years after starting my 
Ph.D., I found what I hoped would 
be my next career step: a permanent faculty position at my 
alma mater. As I went into overdrive to finish my thesis and 
put together a compelling application, I drew on abilities 
honed during my time managing the conservation 
area—including meeting deadlines and multitasking 
effectively—to wrap up my degree and land the job. 

Looking back, I appreciate how my precocious experience 
as a reserve administrator has contributed to my progress 
in academia. I had been thrown into the deep end, alone at 
a completely new reserve, where I was expected to mediate 
conflicts and solve problems with next to no resources. In 
turn, I developed creativity, persuasiveness, and patience. 
My initial detour from my academic goals ended up being 
a shortcut to the career I have always wanted. 


Sidney F. Gouveia is an adjunct professor at the Federal 
University of Sergipe in São Cristóvão, Brazil. Send your 
career story to SciCareerEditor@aaas.org. 